File size distribution: Difference between revisions

Added C implementation for Windows.
Beginning from the current directory, or optionally from a directory specified as a command-line argument, determine how many files there are of various sizes in a directory hierarchy. My suggestion is to sort by logarithm of file size, since a few bytes here or there, or even a factor of two or three, may not be that significant. Don't forget that empty files may exist, to serve as a marker. Is your file system predominantly devoted to a large number of smaller files, or a smaller number of huge files?
 
=={{header|C}}==
The platform-independent way to get a file's size in C is to open each file and read its size. The implementation below works on Windows and instead uses command scripts to get size information quickly, even when recursively traversing a large number of directories and files. Both textual and graphical (ASCII) outputs are shown. The same can be done on Linux with a combination of the find, ls and stat commands. My plan was to make it work on both OS types, but I don't have access to a Linux system right now. Supporting both would also mean either abandoning the scaling of the graphical output to fit the console buffer or porting that as well, and including windows.h selectively.
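As a minimal sketch of the platform-independent approach mentioned above, the function below opens a file, seeks to its end and reads the offset. The name portableFileSize is hypothetical and not part of the Windows implementation. It assumes the size fits in a long, and that seeking to the end of a binary stream is supported, which the C standard does not strictly guarantee.

```c
#include <stdio.h>

/* Returns the size of the file in bytes, or -1 if it cannot be determined.
   Limited to what a long can represent (2 GB where long is 32 bits). */
long portableFileSize(const char* path)
{
	FILE* fp = fopen(path,"rb");
	long size;

	if(fp==NULL)
		return -1;

	/* Move to the end of the file; the offset there is the size in bytes */
	if(fseek(fp,0L,SEEK_END)!=0){
		fclose(fp);
		return -1;
	}

	size = ftell(fp);	/* ftell returns -1L on failure */
	fclose(fp);
	return size;
}
```

Calling this for every file in a large hierarchy means one open and one close per file, which is the cost the command-script approach tries to avoid.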
===Windows===
<lang C>
/*Abhishek Ghosh, 13th October 2017*/

#include<windows.h>
#include<string.h>
#include<stdio.h>

#define MAXORDER 25
#define LINELEN 100

int main(int argC, char* argV[])
{
	char str[LINELEN],commandString[1000];
	long int fileSizeLog[MAXORDER] = {0},max;
	int i,j;
	size_t len;
	double scale;
	FILE* fp;

	if(argC==1){
		printf("Usage : %s <followed by directory to start search from(. for current dir), followed by \n optional parameters (T or G) to show text or graph output>",argV[0]);
		return 0;
	}

	/*forfiles walks the tree and echoes the size of every entry, one per line*/
	if(strchr(argV[1],' ')!=NULL)
		/*A path containing spaces must be wrapped in quotes*/
		snprintf(commandString,sizeof commandString,"forfiles /p \"%s\" /s /c \"cmd /c echo @fsize\" 2>&1",argV[1]);
	else if(strlen(argV[1])==1 && argV[1][0]=='.')
		strcpy(commandString,"forfiles /s /c \"cmd /c echo @fsize\" 2>&1");
	else
		snprintf(commandString,sizeof commandString,"forfiles /p %s /s /c \"cmd /c echo @fsize\" 2>&1",argV[1]);

	/*_popen and _pclose are the Windows CRT names for popen and pclose*/
	fp = _popen(commandString,"r");
	if(fp==NULL){
		printf("Could not run : %s",commandString);
		return 1;
	}

	while(fgets(str,LINELEN,fp)!=NULL){
		if(str[0]=='0')
			fileSizeLog[0]++;
		else{
			/*The length includes the trailing newline, so a size with d digits
			  is counted in bucket d+1 and a blank line is counted in bucket 1*/
			len = strlen(str);
			if(len<MAXORDER)
				fileSizeLog[len]++;
		}
	}
	_pclose(fp);

	if(argC==2 || (argC==3 && (argV[2][0]=='t'||argV[2][0]=='T'))){
		for(i=0;i<MAXORDER;i++)
			printf("\nSize Order < 10^%2d bytes : %ld",i,fileSizeLog[i]);
	}
	else if(argC==3 && (argV[2][0]=='g'||argV[2][0]=='G')){
		CONSOLE_SCREEN_BUFFER_INFO csbi;
		int val = GetConsoleScreenBufferInfo(GetStdHandle( STD_OUTPUT_HANDLE ),&csbi);
		if(val)
		{
			/*Find the largest bucket so the bars can be scaled to the console width*/
			max = fileSizeLog[0];
			for(i=1;i<MAXORDER;i++)
				if(fileSizeLog[i]>max)
					max = fileSizeLog[i];

			/*Leave 50 columns for the label and the trailing count*/
			if(max < csbi.dwSize.X)
				scale = 1;
			else
				scale = (1.0*(csbi.dwSize.X-50))/max;

			for(i=0;i<MAXORDER;i++){
				printf("\nSize Order < 10^%2d bytes |",i);
				for(j=0;j<(int)(scale*fileSizeLog[i]);j++)
					printf("%c",219);	/*219 is the full block character in code page 437*/
				printf("%ld",fileSizeLog[i]);
			}
		}
	}
	return 0;
}
</lang>
Invocation and textual output :
<pre>
C:\My Projects\threeJS>fileSize.exe "C:\My Projects" t
 
Size Order < 10^ 0 bytes : 1770
Size Order < 10^ 1 bytes : 1
Size Order < 10^ 2 bytes : 20
Size Order < 10^ 3 bytes : 219
Size Order < 10^ 4 bytes : 1793
Size Order < 10^ 5 bytes : 1832
Size Order < 10^ 6 bytes : 631
Size Order < 10^ 7 bytes : 124
Size Order < 10^ 8 bytes : 26
Size Order < 10^ 9 bytes : 0
Size Order < 10^10 bytes : 0
Size Order < 10^11 bytes : 0
Size Order < 10^12 bytes : 0
Size Order < 10^13 bytes : 0
Size Order < 10^14 bytes : 0
Size Order < 10^15 bytes : 0
Size Order < 10^16 bytes : 0
Size Order < 10^17 bytes : 0
Size Order < 10^18 bytes : 0
Size Order < 10^19 bytes : 0
Size Order < 10^20 bytes : 0
Size Order < 10^21 bytes : 0
Size Order < 10^22 bytes : 0
Size Order < 10^23 bytes : 0
Size Order < 10^24 bytes : 0
</pre>
Invocation and graphical output :
<pre>
C:\My Projects\threeJS>fileSize.exe "C:\My Projects" g
1832,300, 0.136463
Size Order < 10^ 0 bytes |█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████1770
Size Order < 10^ 1 bytes |1
Size Order < 10^ 2 bytes |██20
Size Order < 10^ 3 bytes |█████████████████████████████219
Size Order < 10^ 4 bytes |████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████1793
Size Order < 10^ 5 bytes |██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████1832
Size Order < 10^ 6 bytes |██████████████████████████████████████████████████████████████████████████████████████631
Size Order < 10^ 7 bytes |████████████████124
Size Order < 10^ 8 bytes |███26
Size Order < 10^ 9 bytes |0
Size Order < 10^10 bytes |0
Size Order < 10^11 bytes |0
Size Order < 10^12 bytes |0
Size Order < 10^13 bytes |0
Size Order < 10^14 bytes |0
Size Order < 10^15 bytes |0
Size Order < 10^16 bytes |0
Size Order < 10^17 bytes |0
Size Order < 10^18 bytes |0
Size Order < 10^19 bytes |0
Size Order < 10^20 bytes |0
Size Order < 10^21 bytes |0
Size Order < 10^22 bytes |0
Size Order < 10^23 bytes |0
Size Order < 10^24 bytes |0
</pre>
Note that it is possible to track files up to 10^24 (Yottabyte) in size with this implementation, but if you have a file that large, you shouldn't be needing such programs. :)
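The bucketing does not depend on forfiles. The following is a minimal sketch, assuming the sizes are already available as unsigned integers; the helper name sizeOrder is hypothetical and not part of the program above. It counts decimal digits directly, so bucket k holds sizes from 10^(k-1) to 10^k - 1, whereas the program above indexes by the length of each output line including its newline.

```c
/* Returns 0 for an empty file, otherwise the number of decimal digits in size,
   so that bucket k holds sizes from 10^(k-1) up to 10^k - 1. */
int sizeOrder(unsigned long long size)
{
	int order = 0;

	/* Each division by 10 strips one decimal digit */
	while(size>0){
		size /= 10;
		order++;
	}
	return order;
}
```

A counter array indexed by the return value then gives the distribution. An unsigned long long of at least 64 bits covers sizes of up to 20 decimal digits.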
=={{header|Perl 6}}==
{{works with|Rakudo|2017.05}}