我知道通過vacuum可以把一個(gè)表的內(nèi)容整理一下,使磁盤空間的占用變小,但pgsql能不能自動(dòng)利用表中的空閑空間呢?
例如我做了這么一個(gè)測(cè)試:
在一個(gè)表里insert十萬條記錄,記下它占用的磁盤空間,比如說是1000k,再刪除其中的5萬條記錄,接著再往里插5萬條記錄,這時(shí)這個(gè)表的磁盤空間不是接近于先前看到的1000k,而是變成了1500k,怎么會(huì)是這樣呢?刪掉的那5萬條記錄仍然會(huì)一直占著空間,等到我來做vacuum嗎?
netkiller 回復(fù)于:2004-07-07 11:15:31
那我建議你去用oracle,sqlserver,sybase
也做這樣的操作.你看看他們?cè)龃蟛唬浚浚?/p>
你以為是方本文件啊.刪了在加上.大小還一樣..
里有要日志,rollback段,......
qjlemon 回復(fù)于:2004-07-07 18:27:00
班長(zhǎng)兄弟請(qǐng)別發(fā)火:)這個(gè)問題我是真感到比較困惑,所以來這里請(qǐng)教解決方法的。oracle下我做過試驗(yàn)的,我當(dāng)然沒有指望哪個(gè)數(shù)據(jù)庫能在一個(gè)固定大小的文件里無限制地放數(shù)據(jù),這個(gè)是最起碼的常識(shí)了。
這個(gè)問題我后來自已找到答案了,還是要用vacuum來搞定,與max_fsm_pages這個(gè)配置的大小有關(guān),看來這個(gè)問題太菜了,所以我也不在這里貼了:em10:
netkiller 回復(fù)于:2004-07-08 09:49:36
哈哈。我以前用sqlserver就發(fā)現(xiàn)在天天長(zhǎng)大,長(zhǎng)啊長(zhǎng)啊。。我都沒辦法。最后得裝:)
netkiller 回復(fù)于:2004-07-08 09:50:28
你可詳細(xì)介紹一下
max_fsm_pages
你的這次操作經(jīng)驗(yàn)給大家:)
qjlemon 回復(fù)于:2004-07-08 13:58:12
響應(yīng)班長(zhǎng)號(hào)召:em11:
在很久很久以前。。。。。在一篇名為<<tuningpostgresqlforperformance>>的文章,其中有這么一段:
|||max_fsm_pages:
postgresqlrecordsfreespaceineachofitsdatapages.thisinformationisusefulforvacuumtofindouthowmanyandwhichpagestolookforwhenitfreesupthespace.
ifyouhaveadatabasethatdoeslotsofupdatesanddeletes,thatisgoingtogeneratedeadtuples,duetopostgresql'smvccsystem.thespaceoccupiedbydeadtuplescanbefreedwithvacuum,unlessthereismorewastedspacethaniscoveredbythefreespacemap,inwhichcasethemuchlessconvenient"vacuumfull"isrequired.byexpandingthefsmtocoverallofthosedeadtuples,youmightneveragainneedtorunvacuumfullexceptonholidays.
thebestwaytosetmax_fsm_pagesisinteractive;first,figureoutthevacuum(regular)frequencyofyourdatabasebasedonwriteactivity;next,runthedatabaseundernormalproductionload,andrun"vacuumverboseanalyze"insteadofvacuum,savingtheoutputtoafile;finally,calculatethemaximumtotalnumberofpagesreclaimedbetweenvacuumsbasedontheoutput,andusethat.
remember,thisisadatabaseclusterwidesetting.sobumpitupenoughtocoveralldatabasesinyourdatabasecluster.also,eachfsmpageuses6bytesoframforadministrativeoverhead,soincreasingfsmsubstantiallyonsystemslowonrammaybecounter-productive.
大意是說,postgresql.conf中的這個(gè)參數(shù)(max_fsm_pages)用于告訴postgresql申請(qǐng)多大的內(nèi)存空間用于保存數(shù)據(jù)文件的freespace信息,按我的簡(jiǎn)單理解,如果在一個(gè)表中刪除了一些記錄,postgresql會(huì)把這一改動(dòng)記錄在"freespacemap"中,下次如果再往表里插記錄時(shí),根據(jù)freespacemap中的信息,就能利用以前刪記錄而騰出來的磁盤空間。不過freespacemap是存在于內(nèi)存中,大小畢竟是有限的,對(duì)于大量數(shù)據(jù)的刪除+插入,要么指定一個(gè)較大的max_fsm_pages,要么及時(shí)進(jìn)行vacuum以整理表中的碎片,否則,postgresql只有把新插入的記錄添加到文件的末尾,造成文件越來越大。我的一個(gè)程序就是意外地因?yàn)榇疟P空間滿了而中止的,它每次要往一個(gè)表里插500多萬條記錄,這之前先要delete同樣條數(shù)的一批記錄,可最后還是占滿了整個(gè)硬盤。
|||我覺得postgresql的這種工作方式有它的一個(gè)好處,就是如果內(nèi)存足夠大,可以指定一個(gè)很大的freespacemap,對(duì)于oltp型的應(yīng)用,可能會(huì)大幅提高性能(猜測(cè),沒有驗(yàn)證過),另外用戶可以自已選擇在合適的時(shí)候進(jìn)行vacuum或vacuumfull,如果你確信一個(gè)表只會(huì)往里插記錄(如記錄操作日志),對(duì)這個(gè)表就可以永遠(yuǎn)不進(jìn)行vacuumfull,是不是很靈活?
不過,使用vacuumfull大量移動(dòng)數(shù)據(jù)畢竟是件很耗時(shí)的工作,在此期間數(shù)據(jù)庫性能會(huì)嚴(yán)重下降,大概這就是“靈活”的代價(jià)了。在這方面,oracle的block->extent->segment這種復(fù)雜的機(jī)制可能更有效一些吧。據(jù)說postgresql將引入表空間的概念了,值得期待啊!
至于freespacemap設(shè)多大,上面的文章教了個(gè)辦法,照著做就行了,只是需要弄明白,這畢竟是一個(gè)“map”,如果打算刪掉300m的記錄,freespacemap并不需要申請(qǐng)300m喔:d
qjlemon 回復(fù)于:2004-07-08 14:09:24
tuningpostgresqlforperformance
shridhardaithankar,joshberkus
july3,2003copyright2003shridhardaithankarandjoshberkus.
authorizedforre-distributiononlyunderthepostgresqllicense(seewww.postgresql.org/license).
tableofcontents
1introduction
2somebasicparameters
2.1sharedbuffers
2.2sortmemory
2.3effectivecachesize
2.4fsyncandthewalfiles
3somelessknownparameters
3.1random_page_cost
3.2vacuum_mem
3.3max_fsm_pages
3.4max fsm_relations
3.5wal_buffers
4othertips
4.1checkyourfilesystem
4.2trytheautovacuumdaemon
|||4.3tryfreebsd
5theconfsettingguide
1introduction
thisisaquickstartguidefortuningpostgresql'ssettingsforperformance.thisassumesminimalfamiliaritywithpostgresqladministration.inparticular,oneshouldknow,
howtostartandstopthepostmasterservice
howtotuneosparameters
howtotestthechanges
italsoassumesthatyouhavegonethroughthepostgresqladministrationmanualbeforestarting,andtohavesetupyourpostgresqlserverwithatleastthedefaultconfiguration.
therearetwoimportantthingsforanyperformanceoptimization:
decidewhatlevelofperformanceyouwant
ifyoudon'tknowyourexpectedlevelofperformance,youwillendupchasingacarrotalwayscoupleofmetersaheadofyou.theperformancetuningmeasuresgivediminishingreturnsafteracertainthreshold.ifyoudon'tsetthisthresholdbeforehand,youwillendupspendinglotoftimeforminusculegains.
knowyourload
thisdocumentfocusesentirelytuningpostgresql.confbestforyourexistingsetup.thisisnottheendofperformancetuning.afterusingthisdocumenttoextractthemaximumreasonableperformancefromyourhardware,youshouldstartoptimizingyourapplicationforefficientdataaccess,whichisbeyondthescopeofthisarticle.
pleasealsonotethatthetuningadvicesdescribedherearehints.youshouldnotimplementthemallblindly.tuneoneparameteratatimeandtestitsimpactanddecidewhetherornotyouneedmoretuning.testingandbenchmarkingisanintegralpartofdatabasetuning.
|||注冊(cè)會(huì)員,創(chuàng)建你的web開發(fā)資料庫,tuningthesoftwaresettingsexploredinthisarticleisonlyaboutone-thirdofdatabaseperformancetuning,butit'sagoodstartsinceyoucanexperimentwithsomebasicsettingchangesinanafternoon,whereassomeotheraspectsoftuningcanbeverytime-consuming.theothertwo-thirdsofdatabaseapplicationtuningare:
hardwareselectionandsetup
databasesareveryboundtoyoursystem'si/o(disk)accessandmemoryusage.assuch,selectionandconfigurationofdisks,raidarrays,ram,operatingsystem,andcompetitionfortheseresourceswillhaveaprofoundeffectonhowfastyourdatabaseis.wehopetohavealaterarticlecoveringthistopic.
efficientapplicationdesign
yourapplicationalsoneedstobedesignedtoaccessdataefficiently,thoughcarefulquerywriting,plannedandtestedindexing,goodconnectionmanagement,andavoidingperformancepitfallsparticulartoyourversionofpostgresql.expectanotherguidesomedayhelpingwiththis,butreallyittakesseverallargebooksandyearsofexperiencetogetitright...orjustalotoftimeonthemailinglists.
2somebasicparameters
2.1sharedbuffers
sharedbuffersdefinesablockofmemorythatpostgresqlwillusetoholdrequeststhatareawaitingattentionfromthekernelbufferandcpu.thedefaultvalueisquitelowforanyrealworldworkloadandneedtobebeefedup.however,unlikedatabaseslikeoracle,moreisnotalwaysbetter.thereisathresholdabovewhichincreasingthisvaluecanhurtperformance.
thisistheareaofmemorypostgresqlactuallyusestoperformwork.itshouldbesufficientenoughtohandleloadondatabaseserver.otherwisepostgresqlwillstartpushingdatatofileanditwillhurttheperformanceoverall.hencethisisthemostimportantsettingoneneedstotuneup.
|||thisvalueshouldbesetbasedonthedatasetsizewhichthedatabaseserverissupposedtohandleatpeakloadsandonyouravailableram(keepinmindthatramusedbyotherapplicationsontheserverisnotavailable).werecommendfollowingruleofthumbforthisparameter:
startat4mb(512)foraworkstation
mediumsizedatasetand256-512mbavailableram:16-32mb(2048-4096)
largedatasetandlotsofavailableram(1-4gb):64-256mb(8192-32768)
pleasenote.postgresqlcountsalotontheostocachedatafilesandhencedoesnotbotherwithduplicatingitsfilecachingeffort.thesharedbuffersparameterassumesthatosisgoingtocachealotoffilesandhenceitisgenerallyverylowcomparedwithsystemram.evenforadatasetinexcessof20gb,asettingof128mbmaybetoomuch,ifyouhaveonly1gbramandanaggressive-at-cachingoslikelinux.
thereisonewaytodecidewhatisbestforyou.setahighvalueofthisparameterandrunthedatabasefortypicalusage.watchusageofsharedmemoryusingipcsorsimilartools.arecommendedfigurewouldbebetween1.2to2timespeaksharedmemoryusage.
2.2sortmemory
thisparametersetsmaximumlimitonmemorythatadatabaseconnectioncanusetoperformsorts.ifyourquerieshaveorder-byorgroup-byclausesthatrequiresortinglargedataset,increasingthisparameterwouldhelp.butbeware:thisparameterispersort,perconnection.thinktwicebeforesettingthisparametertoohighonanydatabasewithmanyusers.arecommendedapproachistosetthisparameterperconnectionasandwhenrequired;thatis,lowformostsimplequeriesandhigherforlarge,complexqueriesanddatadumps.
|||注冊(cè)會(huì)員,創(chuàng)建你的web開發(fā)資料庫,2.3effectivecachesize
thisparameterallowspostgresqltomakebestpossibleuseoframavailableonyourserver.ittellspostgresqlthesizeofosdatacache.sothatpostgresqlcandrawdifferentexecutionplanbasedonthatdata.
saythereis1.5gbraminyourmachine,sharedbuffersaresetto32mbandeffectivecachesizeissetto800mb.soifaqueryneeds700mbofdataset,postgresqlwouldestimatethatallthedatarequiredshouldbeavailableinmemoryandwouldoptformoreaggressiveplanintermsofoptimization,involvingheavierindexusageandmergejoins.butifeffectivecacheissettoonly200mb,thequeryplannerisliabletooptforthemorei/oefficientsequentialscan.
whilesettingthisparametersize,leaveroomforotherapplicationsrunningontheservermachine.theobjectiveistosetthisvalueatthehighestamountoframwhichwillbeavailabletopostgresqlallthetime.
2.4fsyncandthewalfiles
thisparameterssetswhetherornotwritedatatodiskassoonasitiscommitted,whichisdonethroughwriteaheadlogging(wal).ifyoutrustyourhardware,yourpowercompany,andyourbatterypowersupplyenough,yousetthistonoforanimmediateboosttodatawritespeed.butbeveryawarethatanyunexpecteddatabaseshutdownwillforceyoutorestorethedatabasefromyourlastbackup.
ifthat'snotanoptionforyou,youcanstillhavetheprotectionofwalandbetterperformance.simplymoveyourwalfiles,usingeitheramountorasymlinktothepg_xlogdirectory,toaseparatediskorarrayfromyourmaindatabasefiles.inhigh-write-activitydatabases,walshouldhaveitsowndiskorarraytoensurecontinuoushigh-speedaccess.verylargeraidarraysandsan/nasdevicesfrequentlyhandlethisforyouthroughtheirinternalmanagementsystems.
|||3somelessknownparameters
3.1random_page_cost
thisparametersetsthecosttofetcharandomtuplefromthedatabase,whichinfluencestheplanner'schoiceofindexvs.tablescan.thisissettoahighvalueasthedefaultdefaultbasedontheexpectationofslowdiskaccess.ifyouhavereasonablyfastdiskslikescsiorraid,youcanlowerthecostto2.youneedtoexperimenttofindoutwhatworksbestforyoursetupbyrunningavarietyofqueriesandcomparingexecutiontimes.
3.2vacuum_mem
thisparametersetsthememoryallocatedtovacuum.normally,vacuumisadiskintensiveprocess,butraisingthisparameterwillspeeditupbyallowingpostgresqltocopylargerblocksintomemory.justdon'tsetitsohighittakessignificantmemoryawayfromnormaldatabaseoperation.thingsbetween16-32mbshouldbegoodenoughformostsetups.
3.3max_fsm_pages
postgresqlrecordsfreespaceineachofitsdatapages.thisinformationisusefulforvacuumtofindouthowmanyandwhichpagestolookforwhenitfreesupthespace.
ifyouhaveadatabasethatdoeslotsofupdatesanddeletes,thatisgoingtogeneratedeadtuples,duetopostgresql'smvccsystem.thespaceoccupiedbydeadtuplescanbefreedwithvacuum,unlessthereismorewastedspacethaniscoveredbythefreespacemap,inwhichcasethemuchlessconvenient"vacuumfull"isrequired.byexpandingthefsmtocoverallofthosedeadtuples,youmightneveragainneedtorunvacuumfullexceptonholidays.
thebestwaytosetmax_fsm_pagesisinteractive;first,figureoutthevacuum(regular)frequencyofyourdatabasebasedonwriteactivity;next,runthedatabaseundernormalproductionload,andrun"vacuumverboseanalyze"insteadofvacuum,savingtheoutputtoafile;finally,calculatethemaximumtotalnumberofpagesreclaimedbetweenvacuumsbasedontheoutput,andusethat.
|||商業(yè)源碼熱門下載www.html.org.cn
remember,thisisadatabaseclusterwidesetting.sobumpitupenoughtocoveralldatabasesinyourdatabasecluster.also,eachfsmpageuses6bytesoframforadministrativeoverhead,soincreasingfsmsubstantiallyonsystemslowonrammaybecounter-productive.
3.4max_fsm_relations
thissettingdictateshowmanynumberofrelations(tables)willbetrackedinfreespacemap.againthisisadatabasecluster-widesetting,sosetitaccordingly.inversion7.3.3andlater,thisparametershouldbesetcorrectlyasadefault.inolderversions,bumpitupto300-1000.
3.5wal_buffers
thissettingdecidesthenumberofbufferswal(writeaheadlog)canhave.ifyourdatabasehasmanywritetransactions,settingthisvaluebithigherthandefaultcouldresultbetterusageofdiskspace.experimentanddecide.agoodstartwouldbearound32-64correspondingto256-512kmemory.
4othertips
4.1checkyourfilesystem
onoslikelinux,whichoffersmultiplefilesystems,oneshouldbecarefulaboutchoosingtherightonefromaperformancepointofview.thereisnoagreementbetweenpostgresqlusersaboutwhichoneisbest.
contrarytopopularbelief,today'sjournalingfilesystemsarenotnecessarilyslowercomparedtonon-journalingones.ext2canbefasteronsomesetupsbuttherecoveryissuesgenerallymakeitsuseprohibitive.differentpeoplehavereportedwidelydifferentexperienceswiththespeedofext3,reiserfs,andxfs;quitepossiblythiskindofbenchmarkdependsonacombinationoffilesystem,disk/arrayconfiguration,osversion,anddatabasetablesizeanddistribution.assuch,youmaybebetteroffstickingwiththefilesystembestsupportedbyyourdistribution,suchasreiserfsforsuselinuxorext3forredhatlinux,nottoforgetxfsknownforit'slargefilesupport.ofcourse,ifyouhavetimetoruncomprehensivebenchmarks,wewouldbeinterestedinseeingtheresults!
|||asaneasyperformanceboostwithnodownside,makesurethefilesystemonwhichyourdatabaseiskeptismounted"noatime",whichturnsofftheaccesstimebookkeeping.
4.2trytheautovacuumdaemon
thereisalittleknownmoduleinpostgresqlcontribdirectorycalledaspgavd.itworksinconjunctionwithstatisticscollector.itperiodicallyconnectstoadatabaseandchecksifithasdoneenoughoperationssincethelastcheck.ifyes,itwillvacuumthedatabase.
essentiallyitwillvacuumthedatabasewhenitneedsit.itwouldgetridofplayingwithcronsettingsforvacuumfrequency.itshouldresultinbetterdatabaseperformancebyeliminatingoverduevacuumissues.
4.3tryfreebsd
largeupdates,deletes,andvacuuminpostgresqlareverydiskintensiveprocesses.inparticular,sincevacuumgobblesupiobandwidth,therestofthedatabaseactivitiescouldbeaffectedadverselywhenvacuumingverylargetables.
os'sfromthebsdfamily,suchasfreebsd,dynamicallyaltertheiopriorityofaprocess.soifyoulowerthepriorityofavacuumprocess,itshouldnotchewasmuchbandwidthandwillbetterallowthedatabasetoperformnormally.ofcoursethismeansthatvacuumcouldtakelonger,whichwouldbeproblematicfora"vacuumfull."
ifyouarenotdonewithyourchoiceofosforyourserverplatform,considerbsdforthisreason.
5theconfsettingguide
availablehereisanannotatedguidetothepostgresqlconfigurationfilesettings,inbothopenoffice.organdpdfformat.thisguideexpandsontheofficialdocumentationandmayeventuallybeincorporatedintoit.
新聞熱點(diǎn)
疑難解答
圖片精選