File size distribution
Beginning from the current directory, or optionally from a directory specified as a command-line argument, determine how many files there are of various sizes in a directory hierarchy. My suggestion is to sort by the logarithm of file size, since a few bytes here or there, or even a factor of two or three, may not be that significant. Don't forget that empty files may exist, to serve as a marker. Is your file system predominantly devoted to a large number of smaller files, or a smaller number of huge files?
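The logarithmic grouping suggested above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a reference solution: the names `size_bucket`, `size_distribution` and `format_histogram` are made up for this sketch, buckets are powers of ten (bucket 0 is reserved for empty files), and symbolic links are skipped so that each file is counted once.

```python
import os
from collections import Counter


def size_bucket(size: int) -> int:
    """Return 0 for an empty file, otherwise the number of decimal digits
    in the size, which equals floor(log10(size)) + 1."""
    return 0 if size == 0 else len(str(size))


def size_distribution(root: str) -> Counter:
    """Walk the tree below root and count files per size bucket.

    Assumption of this sketch: symbolic links are skipped, and files that
    cannot be examined (removed mid-walk, permission errors) are ignored.
    """
    counts = Counter()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                if os.path.islink(path):
                    continue
                counts[size_bucket(os.path.getsize(path))] += 1
            except OSError:
                continue
    return counts


def format_histogram(counts: Counter, width: int = 50) -> list:
    """Render one text line per bucket, scaled so the fullest bucket
    gets `width` asterisks."""
    if not counts:
        return []
    scale = width / max(counts.values())
    lines = []
    for bucket in range(max(counts) + 1):
        label = "empty" if bucket == 0 else f"< 10^{bucket}"
        stars = "*" * round(scale * counts[bucket])
        lines.append(f"{label:>10} : {stars}")
    return lines
```

To use it from the command line, one could pass `sys.argv[1]` when it is present and `"."` otherwise to `size_distribution`, then print the lines returned by `format_histogram`. Counting decimal digits avoids floating-point `log10` and makes the empty-file case explicit.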
zkl
<lang zkl>pipe:=Thread.Pipe();
// hoover all files in tree, don't return directories
fcn(pipe,dir){ File.globular(dir,"*",True,8,pipe); } .launch(pipe,vm.arglist[0]); // thread

dist,N,SZ,maxd:=List.createLong(50,0),0,0,0;
foreach fnm in (pipe){
   sz,szd:=File.len(fnm), sz.numDigits;
   dist[szd]+=1;
   N+=1; SZ+=sz; maxd=maxd.max(szd);
}
println("Found %d files, %,d bytes, %,d mean.".fmt(N,SZ,SZ/N));
scale:=50.0/(0.0).max(dist);
println("%15s %s (* = %.2f)".fmt("File size","Number of files",1.0/scale));
foreach sz,cnt in ([0..].zip(dist[0,maxd])){
   z:="%,d".fmt((10).pow(sz)).replace("1","n").replace("0","n");
   println("%15s : %s".fmt(z,"*"*(scale*cnt).round().toInt()));
}</lang>
- Output:
$ zkl flSzDist.zkl ..
Found 1832 files, 108,667,806 bytes, 59,316 mean.
      File size Number of files (* = 13.44)
              n : *
             nn : ***
            nnn : ********
          n,nnn : **********************************
         nn,nnn : **************************************************
        nnn,nnn : ********************************
      n,nnn,nnn : *******

$ zkl flSzDist.zkl /media/Tunes/
Found 4320 files, 67,627,849,052 bytes, 15,654,594 mean.
      File size Number of files (* = 69.84)
              n :
             nn :
            nnn :
          n,nnn : *
         nn,nnn :
        nnn,nnn :
      n,nnn,nnn : *
     nn,nnn,nnn : **************************************************
    nnn,nnn,nnn : ********
  n,nnn,nnn,nnn : *