

A Brief Look at Asynchronous Control Flow in Node.js

2019-11-19 15:05:23
Contributed by: a site user

Preface

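Without solid experience using callbacks, this material can be a little hard to follow. The "callback hell" problem comes from Node.js's asynchronous, callback-based style, and in this article I record in some detail how to keep asynchronous control flow manageable.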

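The article is long, and it is an explanation of asynchronous control flow patterns. It uses a simple web spider as the running example: the spider fetches the page at a given URL and saves it inside the project. The source code demo for the whole article can be found at the end.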

1. Plain JavaScript patterns

This article is not aimed at beginners, so most of the basics are skipped:

(spider_v1.js)

const request = require("request");
const fs = require("fs");
const mkdirp = require("mkdirp");
const path = require("path");
const utilities = require("./utilities");

function spider(url, callback) {
  const filename = utilities.urlToFilename(url);
  console.log(`filename: ${filename}`);
  fs.exists(filename, exists => {
    if (!exists) {
      console.log(`Downloading ${url}`);
      request(url, (err, response, body) => {
        if (err) {
          callback(err);
        } else {
          mkdirp(path.dirname(filename), err => {
            if (err) {
              callback(err);
            } else {
              fs.writeFile(filename, body, err => {
                if (err) {
                  callback(err);
                } else {
                  callback(null, filename, true);
                }
              });
            }
          });
        }
      });
    } else {
      callback(null, filename, false);
    }
  });
}

spider(process.argv[2], (err, filename, downloaded) => {
  if (err) {
    console.log(err);
  } else if (downloaded) {
    console.log(`Completed the download of ${filename}`);
  } else {
    console.log(`${filename} was already downloaded`);
  }
});

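The code above works roughly like this:

  1. Convert the url into a filename
  2. Check whether a file with that name exists; if it does, return immediately, otherwise go on to the next step
  3. Send the request and get the body
  4. Write the body to the file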

This is a very simple spider that can only fetch the content of a single url, and the nested callbacks above are already painful to read. So let's start improving it.

First, the if/else style can be improved. This is simple and needs little explanation, so here is a before/after comparison:

/// before
if (err) {
  callback(err);
} else {
  callback(null, filename, true);
}

/// after
if (err) {
  return callback(err);
}
callback(null, filename, true);

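Written this way, there is one less level of nesting. Experienced programmers may object that this style puts too much emphasis on the error, and that the focus of the code should be on handling the successful data; readability calls for the same thing.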

Another improvement is splitting up functions: downloading the file and saving the file can be pulled out of the spider function above.

(spider_v2.js)

const request = require("request");
const fs = require("fs");
const mkdirp = require("mkdirp");
const path = require("path");
const utilities = require("./utilities");

function saveFile(filename, contents, callback) {
  mkdirp(path.dirname(filename), err => {
    if (err) {
      return callback(err);
    }
    fs.writeFile(filename, contents, callback);
  });
}

function download(url, filename, callback) {
  console.log(`Downloading ${url}`);
  request(url, (err, response, body) => {
    if (err) {
      return callback(err);
    }
    saveFile(filename, body, err => {
      if (err) {
        return callback(err);
      }
      console.log(`Downloaded and saved: ${url}`);
      callback(null, body);
    });
  });
}

function spider(url, callback) {
  const filename = utilities.urlToFilename(url);
  console.log(`filename: ${filename}`);
  fs.exists(filename, exists => {
    if (exists) {
      return callback(null, filename, false);
    }
    download(url, filename, err => {
      if (err) {
        return callback(err);
      }
      callback(null, filename, true);
    });
  });
}

spider(process.argv[2], (err, filename, downloaded) => {
  if (err) {
    console.log(err);
  } else if (downloaded) {
    console.log(`Completed the download of ${filename}`);
  } else {
    console.log(`${filename} was already downloaded`);
  }
});

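The code above is roughly what you get after tidying things up with plain JavaScript, but the spider is still too simple. We now need to crawl every url found in a page, which is what brings up the question of serial versus parallel execution.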

(spider_v3.js)

const request = require("request");
const fs = require("fs");
const mkdirp = require("mkdirp");
const path = require("path");
const utilities = require("./utilities");

function saveFile(filename, contents, callback) {
  mkdirp(path.dirname(filename), err => {
    if (err) {
      return callback(err);
    }
    fs.writeFile(filename, contents, callback);
  });
}

function download(url, filename, callback) {
  console.log(`Downloading ${url}`);
  request(url, (err, response, body) => {
    if (err) {
      return callback(err);
    }
    saveFile(filename, body, err => {
      if (err) {
        return callback(err);
      }
      console.log(`Downloaded and saved: ${url}`);
      callback(null, body);
    });
  });
}

/// The key takeaway here is how to iterate over an array asynchronously
function spiderLinks(currentUrl, body, nesting, callback) {
  if (nesting === 0) {
    return process.nextTick(callback);
  }
  const links = utilities.getPageLinks(currentUrl, body);
  function iterate(index) {
    if (index === links.length) {
      return callback();
    }
    spider(links[index], nesting - 1, err => {
      if (err) {
        return callback(err);
      }
      iterate(index + 1);
    });
  }
  iterate(0);
}

function spider(url, nesting, callback) {
  const filename = utilities.urlToFilename(url);
  fs.readFile(filename, "utf8", (err, body) => {
    if (err) {
      if (err.code !== 'ENOENT') {
        return callback(err);
      }
      return download(url, filename, (err, body) => {
        if (err) {
          return callback(err);
        }
        spiderLinks(url, body, nesting, callback);
      });
    }
    spiderLinks(url, body, nesting, callback);
  });
}

/// In this version spider only ever passes err to its callback
spider(process.argv[2], 2, err => {
  if (err) {
    return console.log(err);
  }
  console.log('Download complete');
});

Compared with the previous version, the code above adds two core features. The first is using a helper to extract the links from a body:

const links = utilities.getPageLinks(currentUrl, body);

Its internals are not explained here. The other core piece of code is:

/// The key takeaway here is how to iterate over an array asynchronously
function spiderLinks(currentUrl, body, nesting, callback) {
  if (nesting === 0) {
    return process.nextTick(callback);
  }
  const links = utilities.getPageLinks(currentUrl, body);
  function iterate(index) {
    if (index === links.length) {
      return callback();
    }
    spider(links[index], nesting - 1, err => {
      if (err) {
        return callback(err);
      }
      iterate(index + 1);
    });
  }
  iterate(0);
}

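This short piece of code is the pattern for sequential asynchronous iteration in plain JavaScript. The code also introduces nesting: this parameter controls how many levels deep the spider crawls.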
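The sequential pattern can be pulled out into a reusable helper. The following is a minimal sketch rather than code from the article: iterateSeries and its parameter names are hypothetical, and setImmediate stands in for real asynchronous work such as a download.

```javascript
// Hypothetical helper (not part of the article's demo): run an asynchronous
// task for each item, one at a time, and call finalCallback once, either
// after the last item or on the first error.
function iterateSeries(collection, iteratorCallback, finalCallback) {
  function iterate(index) {
    if (index === collection.length) {
      return finalCallback();
    }
    iteratorCallback(collection[index], err => {
      if (err) {
        return finalCallback(err);
      }
      iterate(index + 1);
    });
  }
  iterate(0);
}

// Usage: every task completes on a later tick, yet the order is preserved.
const visited = [];
iterateSeries(["a", "b", "c"], (item, done) => {
  setImmediate(() => {
    visited.push(item);
    done();
  });
}, err => {
  if (err) {
    return console.log(err);
  }
  console.log(visited.join(",")); // prints "a,b,c"
});
```

The next item is started only from inside the previous item's callback, which is what keeps the tasks strictly one after another.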

At this point the serial version is complete. For performance, we now want to crawl in parallel.

(spider_v4.js)

const request = require("request");
const fs = require("fs");
const mkdirp = require("mkdirp");
const path = require("path");
const utilities = require("./utilities");

function saveFile(filename, contents, callback) {
  mkdirp(path.dirname(filename), err => {
    if (err) {
      return callback(err);
    }
    fs.writeFile(filename, contents, callback);
  });
}

function download(url, filename, callback) {
  console.log(`Downloading ${url}`);
  request(url, (err, response, body) => {
    if (err) {
      return callback(err);
    }
    saveFile(filename, body, err => {
      if (err) {
        return callback(err);
      }
      console.log(`Downloaded and saved: ${url}`);
      callback(null, body);
    });
  });
}

/// The key takeaway here is how to iterate over an array asynchronously
function spiderLinks(currentUrl, body, nesting, callback) {
  if (nesting === 0) {
    return process.nextTick(callback);
  }
  const links = utilities.getPageLinks(currentUrl, body);
  if (links.length === 0) {
    return process.nextTick(callback);
  }
  let completed = 0, hasErrors = false;
  function done(err) {
    if (err) {
      hasErrors = true;
      return callback(err);
    }
    if (++completed === links.length && !hasErrors) {
      return callback();
    }
  }
  links.forEach(link => {
    spider(link, nesting - 1, done);
  });
}

const spidering = new Map();

function spider(url, nesting, callback) {
  if (spidering.has(url)) {
    return process.nextTick(callback);
  }
  spidering.set(url, true);
  const filename = utilities.urlToFilename(url);
  /// Without the spidering map above, concurrent tasks could
  /// download the same url again and again.
  fs.readFile(filename, "utf8", (err, body) => {
    if (err) {
      if (err.code !== 'ENOENT') {
        return callback(err);
      }
      return download(url, filename, (err, body) => {
        if (err) {
          return callback(err);
        }
        spiderLinks(url, body, nesting, callback);
      });
    }
    spiderLinks(url, body, nesting, callback);
  });
}

/// In this version spider only ever passes err to its callback
spider(process.argv[2], 2, err => {
  if (err) {
    return console.log(err);
  }
  console.log('Download complete');
});

This code is also simple, and it too has two core parts. The first is how concurrency is implemented:

/// The key takeaway here is how to iterate over an array asynchronously
function spiderLinks(currentUrl, body, nesting, callback) {
  if (nesting === 0) {
    return process.nextTick(callback);
  }
  const links = utilities.getPageLinks(currentUrl, body);
  if (links.length === 0) {
    return process.nextTick(callback);
  }
  let completed = 0, hasErrors = false;
  function done(err) {
    if (err) {
      hasErrors = true;
      return callback(err);
    }
    if (++completed === links.length && !hasErrors) {
      return callback();
    }
  }
  links.forEach(link => {
    spider(link, nesting - 1, done);
  });
}

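The code above is a pattern for concurrent execution: it starts every task in a loop and counts the completions. The second core part is this: because the tasks run concurrently, checking on disk whether the file exists (for example with fs.exists) is no longer reliable, and the same file may be downloaded more than once. The solution here is: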

  • Cache each url in a Map, with the url as the key
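The concurrent pattern can also be written as a generic helper. This is a sketch under assumptions, not the article's code: runParallel and makeTask are hypothetical names, and timers stand in for downloads. Unlike the done function above, this sketch also ignores results that arrive after the first error, so the final callback fires at most once.

```javascript
// Hypothetical helper: start every task immediately, report the first
// error, or finish once all tasks have completed.
function runParallel(tasks, finalCallback) {
  if (tasks.length === 0) {
    return process.nextTick(finalCallback);
  }
  let completed = 0;
  let failed = false;
  tasks.forEach(task => {
    task(err => {
      if (failed) {
        return; // an error was already reported; ignore later results
      }
      if (err) {
        failed = true;
        return finalCallback(err);
      }
      if (++completed === tasks.length) {
        finalCallback();
      }
    });
  });
}

// Usage: three timers with different delays run at the same time.
const finished = [];
const makeTask = (name, delay) => done => {
  setTimeout(() => {
    finished.push(name);
    done();
  }, delay);
};

runParallel([makeTask("slow", 30), makeTask("fast", 5), makeTask("mid", 15)], err => {
  if (err) {
    return console.log(err);
  }
  // Completion order follows the delays, not the order the tasks were started in.
  console.log(finished);
});
```

All three tasks are started before any of them finishes; the counter is the only thing that tells the helper when the whole batch is done.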

Now we have a new requirement: limit the maximum number of tasks running at the same time. This brings in what I consider the most important concept: the queue.

(task-Queue.js)

class TaskQueue {
  constructor(concurrency) {
    this.concurrency = concurrency;
    this.running = 0;
    this.queue = [];
  }

  pushTask(task) {
    this.queue.push(task);
    this.next();
  }

  next() {
    while (this.running < this.concurrency && this.queue.length) {
      const task = this.queue.shift();
      task(() => {
        this.running--;
        this.next();
      });
      this.running++;
    }
  }
}

module.exports = TaskQueue;

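The code above implements the queue, and its core is the next() method. When a task is pushed onto the queue, next() is called immediately. That does not mean the task itself runs right away: it starts only while fewer than concurrency tasks are running.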
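To see the limit in action, here is a small sketch. The TaskQueue class is copied from task-Queue.js above so that the example is self-contained; the counters (active, maxActive, finished) and the timer-based tasks are assumptions made for illustration only.

```javascript
// Copied from task-Queue.js above so this example runs on its own.
class TaskQueue {
  constructor(concurrency) {
    this.concurrency = concurrency;
    this.running = 0;
    this.queue = [];
  }

  pushTask(task) {
    this.queue.push(task);
    this.next();
  }

  next() {
    while (this.running < this.concurrency && this.queue.length) {
      const task = this.queue.shift();
      task(() => {
        this.running--;
        this.next();
      });
      this.running++;
    }
  }
}

// Illustrative usage: push five tasks into a queue that allows two at a time
// and record how many were ever active at the same moment.
const queue = new TaskQueue(2);
const total = 5;
let active = 0;
let maxActive = 0;
let finished = 0;

for (let i = 0; i < total; i++) {
  queue.pushTask(done => {
    active++;
    maxActive = Math.max(maxActive, active);
    setTimeout(() => {
      active--;
      done(); // frees the slot, which lets next() start a waiting task
      if (++finished === total) {
        console.log(`max concurrent tasks: ${maxActive}`); // prints "max concurrent tasks: 2"
      }
    }, 10);
  });
}
```

The first two tasks start as soon as they are pushed; the remaining three wait in the queue until a running task calls done.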

(spider_v5.js)

const request = require("request");
const fs = require("fs");
const mkdirp = require("mkdirp");
const path = require("path");
const utilities = require("./utilities");
const TaskQueue = require("./task-Queue");
const downloadQueue = new TaskQueue(2);

function saveFile(filename, contents, callback) {
  mkdirp(path.dirname(filename), err => {
    if (err) {
      return callback(err);
    }
    fs.writeFile(filename, contents, callback);
  });
}

function download(url, filename, callback) {
  console.log(`Downloading ${url}`);
  request(url, (err, response, body) => {
    if (err) {
      return callback(err);
    }
    saveFile(filename, body, err => {
      if (err) {
        return callback(err);
      }
      console.log(`Downloaded and saved: ${url}`);
      callback(null, body);
    });
  });
}

/// The key takeaway here is how to iterate over an array asynchronously
function spiderLinks(currentUrl, body, nesting, callback) {
  if (nesting === 0) {
    return process.nextTick(callback);
  }
  const links = utilities.getPageLinks(currentUrl, body);
  if (links.length === 0) {
    return process.nextTick(callback);
  }
  let completed = 0, hasErrors = false;
  links.forEach(link => {
    /// Push a task onto the queue. A task is a function that takes one argument:
    /// when the queue runs the task, it passes in a function for the task
    /// to call once it has finished.
    downloadQueue.pushTask(done => {
      spider(link, nesting - 1, err => {
        /// On any error, report it and bail out; done() is not called
        /// on this path, so this queue slot is not released
        if (err) {
          hasErrors = true;
          return callback(err);
        }
        if (++completed === links.length && !hasErrors) {
          callback();
        }
        done();
      });
    });
  });
}

const spidering = new Map();

function spider(url, nesting, callback) {
  if (spidering.has(url)) {
    return process.nextTick(callback);
  }
  spidering.set(url, true);
  const filename = utilities.urlToFilename(url);
  /// Without the spidering map above, concurrent tasks could
  /// download the same url again and again.
  fs.readFile(filename, "utf8", (err, body) => {
    if (err) {
      if (err.code !== 'ENOENT') {
        return callback(err);
      }
      return download(url, filename, (err, body) => {
        if (err) {
          return callback(err);
        }
        spiderLinks(url, body, nesting, callback);
      });
    }
    spiderLinks(url, body, nesting, callback);
  });
}

/// In this version spider only ever passes err to its callback
spider(process.argv[2], 2, err => {
  if (err) {
    return console.log(`error: ${err}`);
  }
  console.log('Download complete');
});

So, to limit the number of concurrent tasks, all spiderLinks has to do is push each task onto the queue as it iterates over the links. This is relatively simple.

At this point we have used plain JavaScript to build a reasonably complete web spider: it can run serially, it can run concurrently, and it can limit the level of concurrency.

2. Using the async library

Putting different pieces of functionality into separate functions pays off greatly. The async library is very popular, performs well, and is built on callbacks internally.

(spider_v6.js)

const request = require("request");
const fs = require("fs");
const mkdirp = require("mkdirp");
const path = require("path");
const utilities = require("./utilities");
const series = require("async/series");
const eachSeries = require("async/eachSeries");

function download(url, filename, callback) {
  console.log(`Downloading ${url}`);
  let body;
  series([
    callback => {
      request(url, (err, response, resBody) => {
        if (err) {
          return callback(err);
        }
        body = resBody;
        callback();
      });
    },
    mkdirp.bind(null, path.dirname(filename)),
    callback => {
      fs.writeFile(filename, body, callback);
    }
  ], err => {
    if (err) {
      return callback(err);
    }
    console.log(`Downloaded and saved: ${url}`);
    callback(null, body);
  });
}

/// The key takeaway here is how to iterate over an array asynchronously
function spiderLinks(currentUrl, body, nesting, callback) {
  if (nesting === 0) {
    return process.nextTick(callback);
  }
  const links = utilities.getPageLinks(currentUrl, body);
  if (links.length === 0) {
    return process.nextTick(callback);
  }
  eachSeries(links, (link, cb) => {
    "use strict";
    spider(link, nesting - 1, cb);
  }, callback);
}

const spidering = new Map();

function spider(url, nesting, callback) {
  if (spidering.has(url)) {
    return process.nextTick(callback);
  }
  spidering.set(url, true);
  const filename = utilities.urlToFilename(url);
  fs.readFile(filename, "utf8", (err, body) => {
    if (err) {
      if (err.code !== 'ENOENT') {
        return callback(err);
      }
      return download(url, filename, (err, body) => {
        if (err) {
          return callback(err);
        }
        spiderLinks(url, body, nesting, callback);
      });
    }
    spiderLinks(url, body, nesting, callback);
  });
}

/// In this version spider only ever passes err to its callback
spider(process.argv[2], 1, err => {
  if (err) {
    return console.log(err);
  }
  console.log('Download complete');
});

Between the code above and the queue-based version (spider_v7.js), we use only three features of async:

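const series = require("async/series"); // run a list of tasks one after another
const eachSeries = require("async/eachSeries"); // iterate over a collection one item at a time
const queue = require("async/queue"); // task queue with a concurrency limit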

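Since these are fairly simple, they are not explained here. The code that uses async's queue is in (spider_v7.js); it is very similar to the custom queue above, so it is not explained further either.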

3. Promise

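Promise is a specification with many library implementations; here we use the ES6 one. Put simply, a promise is a contract: when the operation succeeds, its resolve function is called, and when it fails, its reject function is called. A promise has a then method, and then returns a new promise, which is what makes call chains possible.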
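A short sketch of chaining with ES6 promises follows. The delay helper is hypothetical and exists only to produce a value asynchronously; the point is that each then call returns a different promise, and that returning a promise from a handler makes the chain wait for it.

```javascript
// Hypothetical helper: resolve with `value` after `ms` milliseconds.
function delay(ms, value) {
  return new Promise(resolve => setTimeout(() => resolve(value), ms));
}

const first = delay(10, 1);
const second = first.then(n => n + 1); // a new promise, not `first` itself
console.log(first === second); // prints false

const chain = second
  .then(n => delay(10, n * 10)) // the chain waits for the returned promise
  .then(n => {
    if (n !== 20) {
      throw new Error("unexpected value"); // a throw rejects the chain
    }
    return n;
  })
  .catch(err => {
    // One catch at the end handles a rejection from any earlier step.
    console.log(err.message);
    return -1;
  });

chain.then(result => console.log(result)); // prints 20
```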

There is a lot more to Promise. In practice, the key question is how to promisify ordinary callback-based functions. That is not covered here either; I am not qualified to explain it.
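For orientation, a promisify helper can be sketched in a few lines. This is an assumption about what such a helper might look like, not the implementation behind utilities.promisify used below (which the article does not show); addLater is a made-up callback-style function. Node.js also ships a built-in util.promisify for the same purpose.

```javascript
// Hypothetical sketch: wrap a function that expects a Node-style callback
// (err, ...results) as its last argument so that it returns a promise.
function promisify(callbackBasedApi) {
  return function promisified(...args) {
    return new Promise((resolve, reject) => {
      callbackBasedApi.call(this, ...args, (err, ...results) => {
        if (err) {
          return reject(err);
        }
        // A single result resolves as a plain value, several as an array.
        resolve(results.length <= 1 ? results[0] : results);
      });
    });
  };
}

// Illustrative callback-style function to convert.
function addLater(a, b, callback) {
  setImmediate(() => {
    if (typeof a !== "number" || typeof b !== "number") {
      return callback(new TypeError("operands must be numbers"));
    }
    callback(null, a + b);
  });
}

const addLaterAsync = promisify(addLater);

addLaterAsync(2, 3)
  .then(sum => console.log(sum)) // prints 5
  .catch(err => console.log(err));
```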

(spider_v8.js)

const utilities = require("./utilities");
const request = utilities.promisify(require("request"));
const fs = require("fs");
const readFile = utilities.promisify(fs.readFile);
const writeFile = utilities.promisify(fs.writeFile);
const mkdirp = utilities.promisify(require("mkdirp"));
const path = require("path");

function download(url, filename) {
  console.log(`Downloading ${url}`);
  let body;
  return request(url)
    .then(response => {
      "use strict";
      body = response.body;
      return mkdirp(path.dirname(filename));
    })
    .then(() => writeFile(filename, body))
    .then(() => {
      "use strict";
      console.log(`Downloaded and saved: ${url}`);
      return body;
    });
}

/// The essence of promise-based programming is to avoid passing callbacks into functions
/// The promise acts as an intermediate layer that lets asynchronous code read like synchronous code
function spiderLinks(currentUrl, body, nesting) {
  let promise = Promise.resolve();
  if (nesting === 0) {
    return promise;
  }
  const links = utilities.getPageLinks(currentUrl, body);
  links.forEach(link => {
    "use strict";
    promise = promise.then(() => spider(link, nesting - 1));
  });
  return promise;
}

function spider(url, nesting) {
  const filename = utilities.urlToFilename(url);
  return readFile(filename, "utf8")
    .then(
      body => spiderLinks(url, body, nesting),
      err => {
        "use strict";
        if (err.code !== 'ENOENT') {
          /// Rethrow so that the catch at the end of the chain can handle
          /// errors from anywhere in the chain
          throw err;
        }
        return download(url, filename)
          .then(body => spiderLinks(url, body, nesting));
      }
    );
}

spider(process.argv[2], 1)
  .then(() => {
    "use strict";
    console.log('Download complete');
  })
  .catch(err => {
    "use strict";
    console.log(err);
  });

As you can see, none of the functions above take a callback; a single catch at the end of the chain is enough.

When designing an API, it is good to support both styles: callbacks as well as promises.

function asyncDivision(dividend, divisor, cb) {
  return new Promise((resolve, reject) => {
    "use strict";
    process.nextTick(() => {
      const result = dividend / divisor;
      if (isNaN(result) || !Number.isFinite(result)) {
        const error = new Error("Invalid operands");
        if (cb) {
          cb(error);
        }
        return reject(error);
      }
      if (cb) {
        cb(null, result);
      }
      resolve(result);
    });
  });
}

asyncDivision(10, 2, (err, result) => {
  "use strict";
  if (err) {
    return console.log(err);
  }
  console.log(result);
});

asyncDivision(22, 11)
  .then((result) => console.log(result))
  .catch((err) => console.log(err));

4. Generator

Generators are interesting: they allow a function to be paused and resumed. With the thunkify and co libraries, the code below turns out really neat.
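To make the pause-and-resume idea concrete, here is a deliberately simplified runner for generators that yield thunks. It is a sketch for illustration only: run and delayThunk are hypothetical names, and the real co library handles many more cases (promises, arrays, nested generators).

```javascript
// Hypothetical thunk factory: a thunk is a function whose only argument
// is a Node-style callback.
function delayThunk(ms, value) {
  return callback => setTimeout(() => callback(null, value), ms);
}

// Simplified runner: each yielded thunk is started, and its result (or
// error) is sent back into the generator, which resumes from the yield.
function run(generatorFunction, finalCallback) {
  const generator = generatorFunction();
  function step(err, value) {
    let result;
    try {
      result = err ? generator.throw(err) : generator.next(value);
    } catch (thrown) {
      return finalCallback(thrown);
    }
    if (result.done) {
      return finalCallback(null, result.value);
    }
    result.value(step); // the thunk resumes the generator when it calls back
  }
  step();
}

run(function* () {
  const a = yield delayThunk(10, 1); // paused here until the thunk calls back
  const b = yield delayThunk(10, 2);
  return a + b;
}, (err, sum) => {
  if (err) {
    return console.log(err);
  }
  console.log(sum); // prints 3
});
```

Inside the generator the code reads top to bottom like synchronous code, and an error passed to a thunk's callback surfaces as an exception at the yield, so try/catch works as usual.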

(spider_v9.js)

const thunkify = require("thunkify");
const co = require("co");
const path = require("path");
const utilities = require("./utilities");
const request = thunkify(require("request"));
const fs = require("fs");
const mkdirp = thunkify(require("mkdirp"));
const readFile = thunkify(fs.readFile);
const writeFile = thunkify(fs.writeFile);
const nextTick = thunkify(process.nextTick);

function* download(url, filename) {
  console.log(`Downloading ${url}`);
  const response = yield request(url);
  console.log(response);
  const body = response[1];
  yield mkdirp(path.dirname(filename));
  yield writeFile(filename, body);
  console.log(`Downloaded and saved ${url}`);
  return body;
}

function* spider(url, nesting) {
  const filename = utilities.urlToFilename(url);
  let body;
  try {
    body = yield readFile(filename, "utf8");
  } catch (err) {
    if (err.code !== 'ENOENT') {
      throw err;
    }
    body = yield download(url, filename);
  }
  yield spiderLinks(url, body, nesting);
}

function* spiderLinks(currentUrl, body, nesting) {
  if (nesting === 0) {
    return nextTick();
  }
  const links = utilities.getPageLinks(currentUrl, body);
  for (let i = 0; i < links.length; i++) {
    yield spider(links[i], nesting - 1);
  }
}

/// co handles the callbacks automatically: a yield returns the callback's arguments
/// directly, packed into an array, with the err argument removed
co(function* () {
  try {
    yield spider(process.argv[2], 1);
    console.log('Download complete');
  } catch (err) {
    console.log(err);
  }
});

Summary

I did not write the concurrent versions of the promise and generator code. The material above comes from the book nodejs-design-patterns.
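As a rough pointer for the missing concurrent promise version, one common approach is to start all tasks and wait for them with Promise.all. The sketch below is an assumption, not code from the book or the demo: fakeSpider is a made-up stand-in for spider(link, nesting - 1), and spiderLinksParallel is a hypothetical name.

```javascript
// Hypothetical stand-in for spider(link, nesting - 1): resolves after a short delay.
function fakeSpider(link) {
  return new Promise(resolve => setTimeout(() => resolve(`done: ${link}`), 10));
}

// Start every task right away, then wait for all of them. Promise.all
// rejects as soon as any one of the promises rejects.
function spiderLinksParallel(links) {
  return Promise.all(links.map(link => fakeSpider(link)));
}

spiderLinksParallel(["/a", "/b"])
  .then(results => console.log(results)) // results keep the order of the input links
  .catch(err => console.log(err));
```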

Demo download

