17: H264VideoStreamParser Explained
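Plenty of people want to do real-time H264 RTP streaming, so how do you make the most of live555 for that? Looking at the existing H264VideoFileServerMediaSubsession, you can see that the sink is an H264VideoRTPSink and the source is an H264VideoStreamFramer. The connection between them is complicated, though, because several other nodes get inserted between those two. The actual chain looks like this: ByteStreamFileSource --> H264VideoStreamParser --> H264VideoStreamFramer --> H264FUAFragmenter --> H264VideoRTPSink. Wow, is it really that complicated? Yes, it is! Of course, you don't have to care about the ins and outs. All you need is to implement your own source, one that captures images and encodes them as H264 (on the CPU or on a DSP, your choice), and use it in place of ByteStreamFileSource. You could call it H264ByteStreamSource, for example. And for efficiency, the capture and encoding should run in a separate thread.

Still, I really want to understand what H264VideoStreamParser is. What is the Parser for? What does it do? How does it cooperate with H264VideoStreamFramer? Does any memory copying happen between them?

Start with one question: what role does H264VideoStreamFramer play? Going by the H264VideoFileServerMediaSubsession code, H264VideoStreamFramer is what truly represents the source; it is the Source that the Sink faces. But it is itself connected to a ByteStreamFileSource.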
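A quick aside before reading the code: if capture and encoding run in their own thread, as suggested above, the encoded frames have to be handed over to the streaming thread somehow. Below is a minimal sketch of one way to do that, a mutex-protected queue. FrameQueue and EncodedFrame are names I made up for illustration; nothing here is part of live555, and a real source would also need some way to wake up the streaming thread when a frame arrives.

```cpp
#include <cstddef>
#include <cstdint>
#include <deque>
#include <mutex>
#include <utility>
#include <vector>

// Hypothetical helper, NOT part of live555: carries encoded frames from the
// capture/encode thread to the thread that feeds the RTP sink.
using EncodedFrame = std::vector<std::uint8_t>;

class FrameQueue {
public:
  // Called by the capture/encode thread each time a frame is ready.
  void push(EncodedFrame frame) {
    std::lock_guard<std::mutex> lock(fMutex);
    fFrames.push_back(std::move(frame));
  }

  // Called by the streaming thread. Never blocks: returns false when no
  // frame is waiting, so the caller can simply try again later.
  bool tryPop(EncodedFrame& out) {
    std::lock_guard<std::mutex> lock(fMutex);
    if (fFrames.empty()) return false;
    out = std::move(fFrames.front());
    fFrames.pop_front();
    return true;
  }

  // Number of frames currently waiting.
  std::size_t size() const {
    std::lock_guard<std::mutex> lock(fMutex);
    return fFrames.size();
  }

private:
  mutable std::mutex fMutex;
  std::deque<EncodedFrame> fFrames;
};
```

The encoder thread would call push() once per encoded frame, and your H264ByteStreamSource would call tryPop() when it is asked for data. The queue is unbounded on purpose to keep the sketch short; a real one would probably want a size limit.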
Take a look at this part of the code:

```cpp
FramedSource* H264VideoFileServerMediaSubsession::createNewStreamSource(
    unsigned /*clientSessionId*/, unsigned& estBitrate) {
  estBitrate = 500; // kbps, estimate

  // Create the video source:
  ByteStreamFileSource* fileSource = ByteStreamFileSource::createNew(envir(), fFileName);
  if (fileSource == NULL) return NULL;
  fFileSize = fileSource->fileSize();

  // Create a framer for the Video Elementary Stream:
  return H264VideoStreamFramer::createNew(envir(), fileSource);
}
```

See? I wasn't kidding. ByteStreamFileSource gets its data from a file. It doesn't care what media format that is; it just reads the file. So clearly H264VideoStreamFramer uses ByteStreamFileSource to get data from the file and then analyzes that data, for example locating each NALU and passing it to the Sink. But H264VideoStreamFramer doesn't do the analysis itself. It uses a Parser, which is why an H264VideoStreamParser shows up in that chain. H264VideoStreamParser holds two source pointers: one is FramedSource* fInputSource, the other is H264VideoStreamFramer* fUsingSource. So H264VideoStreamParser strings fInputSource and fUsingSource together, and fInputSource is the ByteStreamFileSource.

Let's imagine what H264VideoStreamParser gets up to: H264VideoStreamFramer hands its buffer (which is really the sink's) to H264VideoStreamParser. Whenever H264VideoStreamFramer needs a NALU, it asks H264VideoStreamParser for one. H264VideoStreamParser reads a chunk of data from ByteStreamFileSource and analyzes it, and once it has a NALU it passes it to H264VideoStreamFramer. Ah, H264VideoStreamFramer really is a freeloader that reaps without sowing!
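To make "locating each NALU" concrete, here is a minimal sketch that splits an H264 Annex B byte stream on its start codes (00 00 01, optionally preceded by an extra 00). This is not how live555's parser is written, since that one works incrementally on whatever bytes it has buffered; this version assumes the whole stream is already in memory. findNalUnits and nalUnitType are names I made up.

```cpp
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Returns (offset, length) for each NAL unit in an Annex B byte stream.
// The offsets point just past the start code, at the NAL header byte.
// Illustrative helper only, not a live555 function.
std::vector<std::pair<std::size_t, std::size_t>>
findNalUnits(const std::uint8_t* data, std::size_t size) {
  std::vector<std::pair<std::size_t, std::size_t>> units;
  bool inUnit = false;     // have we seen a start code yet?
  std::size_t start = 0;   // offset of the NAL unit currently being scanned
  std::size_t i = 0;
  while (i + 3 <= size) {
    if (data[i] == 0 && data[i + 1] == 0 && data[i + 2] == 1) {
      if (inUnit) {
        // Zero bytes right before a start code belong to the start code
        // (its 4-byte form) or are padding, not to the previous NAL unit.
        std::size_t end = i;
        while (end > start && data[end - 1] == 0) --end;
        if (end > start) units.emplace_back(start, end - start);
      }
      start = i + 3;
      inUnit = true;
      i += 3;
    } else {
      ++i;
    }
  }
  // The last NAL unit runs to the end of the buffer.
  if (inUnit && start < size) units.emplace_back(start, size - start);
  return units;
}

// The low 5 bits of the first NAL byte are nal_unit_type
// (for example 7 = SPS, 8 = PPS, 5 = IDR slice).
inline unsigned nalUnitType(std::uint8_t nalHeader) {
  return nalHeader & 0x1Fu;
}
```

Fed a buffer holding an SPS, a PPS and an IDR slice, this returns three (offset, length) pairs, and nalUnitType() on the first byte of each gives 7, 8 and 5.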
Now let's look at the actual flow:

```cpp
// The Sink calls getNextFrame() on the Source (H264VideoStreamFramer) to get data.
// H264VideoStreamFramer derives from MPEGVideoStreamFramer, so the functions
// below are the ones that get called:
void MPEGVideoStreamFramer::doGetNextFrame() {
  fParser->registerReadInterest(fTo, fMaxSize);
  continueReadProcessing();
}

void MPEGVideoStreamFramer::continueReadProcessing(void* clientData,
                                                   unsigned char* /*ptr*/,
                                                   unsigned /*size*/,
                                                   struct timeval /*presentationTime*/) {
  MPEGVideoStreamFramer* framer = (MPEGVideoStreamFramer*) clientData;
  framer->continueReadProcessing();
}
```

Those two are just pass-throughs. The real work ends up here:

```cpp
void MPEGVideoStreamFramer::continueReadProcessing() {
  // Call the Parser's parse() to extract one NALU. If we got one,
  // hand it back to the Sink with afterGetting(this).
  unsigned acquiredFrameSize = fParser->parse();
  if (acquiredFrameSize > 0) {
    // We were able to acquire a frame from the input.
    // It has already been copied to the reader's space.
    fFrameSize = acquiredFrameSize;
    fNumTruncatedBytes = fParser->numTruncatedBytes();

    // "fPresentationTime" should have already been computed.

    // Compute "fDurationInMicroseconds" now:
    fDurationInMicroseconds
      = (fFrameRate == 0.0 || ((int) fPictureCount) < 0) ? 0
      : (unsigned) ((fPictureCount * 1000000) / fFrameRate);
    fPictureCount = 0;

    // Call our own 'after getting' function.  Because we're not a 'leaf'
    // source, we can call this directly, without risking infinite recursion.
    afterGetting(this);
  } else {
    // Getting here does NOT mean that parse() obtained no data!!
    // We were unable to parse a complete frame from the input, because:
    // - we had to read more data from the source stream, or
    // - the source stream has ended.
  }
}
```

Did you notice the comment in the else{} of the function above? It ties into something that is quite hard to figure out, which I'll explain later. First, let's see how parse() obtains data and analyzes it. Inside parse(), reading new data is triggered by functions such as test4Bytes() and skipBytes(), all of which end up calling ensureValidBytes1():

```cpp
void StreamParser::ensureValidBytes1(unsigned numBytesNeeded) {
  // We need to read some more bytes from the input source.
  // First, clarify how much data to ask for:
  unsigned maxInputFrameSize = fInputSource->maxFrameSize();
  if (maxInputFrameSize > numBytesNeeded) numBytesNeeded = maxInputFrameSize;

  // First, check whether these new bytes would overflow the current
  // bank.  If so, start using a new bank now.
  if (fCurParserIndex + numBytesNeeded > BANK_SIZE) {
    // Swap banks, but save any still-needed bytes from the old bank:
    unsigned numBytesToSave = fTotNumValidBytes - fSavedParserIndex;
    unsigned char const* from = &curBank()[fSavedParserIndex];

    fCurBankNum = (fCurBankNum + 1) % 2;
    fCurBank = fBank[fCurBankNum];
    memmove(curBank(), from, numBytesToSave);
    fCurParserIndex = fCurParserIndex - fSavedParserIndex;
    fSavedParserIndex = 0;
    fTotNumValidBytes = numBytesToSave;
  }

  // ASSERT: fCurParserIndex + numBytesNeeded > fTotNumValidBytes
  //      && fCurParserIndex + numBytesNeeded <= BANK_SIZE
  if (fCurParserIndex + numBytesNeeded > BANK_SIZE) {
    // If this happens, it means that we have too much saved parser state.
    // To fix this, increase BANK_SIZE as appropriate.
    fInputSource->envir() << "StreamParser internal error ("
                          << fCurParserIndex << "+ "
                          << numBytesNeeded << " > "
                          << BANK_SIZE << ")\n";
    fInputSource->envir().internalError();
  }

  // Try to read as many new bytes as will fit in the current bank:
  unsigned maxNumBytesToRead = BANK_SIZE - fTotNumValidBytes;
  fInputSource->getNextFrame(&curBank()[fTotNumValidBytes], maxNumBytesToRead,
                             afterGettingBytes, this,
                             onInputClosure, this);

  throw NO_MORE_BUFFERED_INPUT;
}
```

There's something odd here: this function has no return value, yet it ends by throwing an exception, and it throws every single time it runs. Let's first work out what the function does. It checks whether its current buffer can hold the amount of data needed. If not, it switches to the other bank, carrying over the bytes that are still needed, and if even that isn't enough, all it can do is report the error. Finally it fetches a chunk of data from ByteStreamFileSource. curBank() returns the Parser's own buffer. The afterGettingBytes callback belongs to the Parser, and it forwards to the continue function that H264VideoStreamFramer passed in, so once the data arrives a function of H264VideoStreamFramer gets run, and after a few hops what finally executes is the void MPEGVideoStreamFramer::continueReadProcessing() shown above. Whoa, there's a problem: parse() runs nested inside parse()! When the second parse() finishes, control returns to ensureValidBytes1(), which then exits by throwing the exception. Exits to where? Back to the earlier call of parse(), because parse() is wrapped in try{}catch{}. The code in catch{} is:

```cpp
catch (int /*e*/) {
#ifdef DEBUG
  fprintf(stderr, "H264VideoStreamParser::parse() EXCEPTION (This is normal behavior - *not* an error)\n");
#endif
  return 0;  // the parsing got interrupted
}
```

So parse() returns 0 at this point, and when parse() returns 0 we land in the else{} branch of MPEGVideoStreamFramer::continueReadProcessing(). Go back and look: it does nothing at all. In other words, the first call to parse() only fetches data from ByteStreamFileSource and then appears to do nothing with it. In reality the NALU analysis and handling were completed during that call, just not by the call itself but by the nested parse() call it triggered. Confusing? Walking through the sequence sorts it out (this assumes the input source delivers its data synchronously, from inside getNextFrame()):

1. The sink wants data, so execution reaches MPEGVideoStreamFramer::continueReadProcessing().
2. MPEGVideoStreamFramer::continueReadProcessing() calls parse().
3. parse() goes to use some data and finds there is none, so ensureValidBytes1() is called to fetch data from ByteStreamFileSource.
4. Once the data arrives, afterGettingBytes() is called and forwards to MPEGVideoStreamFramer::continueReadProcessing(). continueReadProcessing() is now being called nested!
5. The nested continueReadProcessing() calls parse() again. This time parse() finds that data is available, analyzes it, extracts one NALU, and returns to continueReadProcessing().
6. continueReadProcessing() calls afterGetting(this) to hand the data back to the sink.
7. When the sink has finished with the data, control returns to the nested continueReadProcessing(), which returns to ensureValidBytes1().
8. ensureValidBytes1() throws, which lands in the catch{} of the first parse() call.
9. That parse() returns to the first continueReadProcessing() call, which sees that parse() produced no NALU, does nothing, and returns to the sink.
10. The sink goes on to fetch the next NALU the same way, via source->getNextFrame() -> MPEGVideoStreamFramer::continueReadProcessing()...

What a twisty tale! But it's finally told.

As you can see, the parser has its own buffer, and its size is fixed:

```cpp
#define BANK_SIZE 150000
```

When you write your own Source, each output is one frame of data containing multiple NALUs, so as long as you make sure a frame never exceeds 150000 bytes you can copy into fTo with confidence. If your frames are bigger than that, change this macro.

For the record, the live555 QQ group number is 224847583. Anyone working on streaming media is welcome to join and discuss anything related: live555, ffmpeg, vlc, rtmp, and so on.
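P.S. If the nested-call story is still hard to picture, here is a toy model of just the control flow: a parse that requests more input, gets re-entered from the callback, and is then unwound by an exception. It is a sketch under heavy simplification, not live555 code. ToyFramer, its members and kNoMoreBufferedInput are all invented for illustration, and the "read" is assumed to complete synchronously.

```cpp
#include <string>
#include <vector>

// Toy stand-in for the value thrown by ensureValidBytes1() (assumption:
// any int works, since the real catch block catches int).
const int kNoMoreBufferedInput = 1;

// Toy model of the framer/parser pair, collapsed into one struct.
struct ToyFramer {
  std::vector<std::string> log;  // what each continueReadProcessing() call saw
  bool haveData = false;         // is there buffered input to parse?
  int depth = 0;                 // current nesting level

  // Mirrors parse(): returns the "frame size", or 0 if interrupted.
  unsigned parse() {
    try {
      if (!haveData) ensureValidBytes();  // never returns normally
      haveData = false;                   // consume the buffered data
      return 1;
    } catch (int /*e*/) {
      return 0;  // the parsing got interrupted
    }
  }

  // Mirrors ensureValidBytes1(): fetch data, let the callback run, then throw.
  void ensureValidBytes() {
    haveData = true;            // the "read" completes immediately...
    continueReadProcessing();   // ...so the callback re-enters the framer
    throw kNoMoreBufferedInput; // unwind the outer, interrupted parse()
  }

  // Mirrors MPEGVideoStreamFramer::continueReadProcessing().
  void continueReadProcessing() {
    ++depth;
    unsigned n = parse();
    log.push_back("depth " + std::to_string(depth) +
                  (n > 0 ? ": delivered frame" : ": nothing to deliver"));
    --depth;
  }
};
```

Calling continueReadProcessing() once on a fresh ToyFramer leaves two log entries: the nested call (depth 2) delivers the frame first, and only afterwards does the outer call (depth 1) report that it has nothing to deliver. That is the same ordering as in the walkthrough above.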