File size distribution: Difference between revisions
Thundergnat (talk | contribs) (→{{header|Perl 6}}: Add a Perl 6 example) |
(Added C implementation for Windows.) |
Beginning from the current directory, or optionally from a directory specified as a command-line argument, determine how many files there are of various sizes in a directory hierarchy. My suggestion is to sort by the logarithm of file size, since a few bytes here or there, or even a factor of two or three, may not be that significant. Don't forget that empty files may exist, to serve as a marker. Is your file system predominantly devoted to a large number of smaller files, or a smaller number of huge files?
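As a minimal sketch of the suggested grouping, the function below maps a size to a base-10 bucket by counting decimal digits, which avoids floating-point logarithms. The name size_bucket is hypothetical and not part of the task; empty files get their own bucket 0.

```c
/* Hypothetical helper, not part of the task description.
   Returns 0 for an empty file, otherwise the number of decimal
   digits in the size, so bucket k holds sizes in [10^(k-1), 10^k). */
int size_bucket(unsigned long long size)
{
    int digits = 0;

    while (size > 0) {
        size /= 10;   /* drop one decimal digit per pass */
        digits++;
    }
    return digits;
}
```

A histogram is then just an array of counters indexed by size_bucket(size). An unsigned long long has at most 20 decimal digits, so 21 counters are enough under this scheme.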
=={{header|C}}==
The platform-independent way to get a file's size in C involves opening every file and reading its size. The implementation below works on Windows and uses command scripts to get size information quickly, even for a large number of files, while recursively traversing a large number of directories. Both textual and graphical (ASCII) outputs are shown. The same can be done on Linux with a combination of the find, ls and stat commands. My plan was to make it work on both OS types, but I don't have access to a Linux system right now. Supporting both would also mean either abandoning the scaling of the graphical output to fit the console buffer or porting that as well, and thus including windows.h selectively.
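For comparison, here is a rough sketch of the "open every file and read the size" approach using only the standard library. The function name portable_file_size is an assumption made for illustration. Seeking to the end of a binary stream is not guaranteed to be meaningful by the C standard, and a long limits the result to about 2 GB on platforms where long is 32 bits, so treat this as an illustration rather than a robust solution.

```c
#include <stdio.h>

/* Hypothetical helper: returns the size of the named file in bytes,
   or -1 if the file cannot be opened or its size cannot be determined. */
long portable_file_size(const char *path)
{
    FILE *fp = fopen(path, "rb");
    long size = -1;

    if (fp == NULL)
        return -1;
    if (fseek(fp, 0L, SEEK_END) == 0)   /* move to the end of the file */
        size = ftell(fp);               /* offset from the start = size */
    fclose(fp);
    return size;
}
```

Walking the directory tree still needs platform-specific calls, which is one reason the listing below delegates that part to a Windows command.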
===Windows===
<lang C>
/*Abhishek Ghosh, 13th October 2017*/
#include<windows.h>
#include<stdlib.h>
#include<string.h>
#include<stdio.h>

#define MAXORDER 25

int main(int argC, char* argV[])
{
    char str[100],commandString[1000];
    long int *fileSizeLog,max;
    int i,j;
    size_t len;
    double scale;
    FILE* fp;

    if(argC==1){
        printf("Usage : %s <followed by directory to start search from(. for current dir), followed by \n optional parameters (T or G) to show text or graph output>",argV[0]);
        return 0;
    }

    /* forfiles echoes one file size per line; quote the start path if it contains spaces */
    if(strchr(argV[1],' ')!=NULL)
        snprintf(commandString,sizeof(commandString),"forfiles /p \"%s\" /s /c \"cmd /c echo @fsize\" 2>&1",argV[1]);
    else if(strcmp(argV[1],".")==0)
        strcpy(commandString,"forfiles /s /c \"cmd /c echo @fsize\" 2>&1");
    else
        snprintf(commandString,sizeof(commandString),"forfiles /p %s /s /c \"cmd /c echo @fsize\" 2>&1",argV[1]);

    fileSizeLog = (long int*)calloc(MAXORDER,sizeof(long int));
    fp = _popen(commandString,"r");    /* _popen is the Windows CRT spelling of popen */

    if(fileSizeLog==NULL || fp==NULL){
        fprintf(stderr,"Unable to allocate memory or run forfiles\n");
        return 1;
    }

    while(fgets(str,sizeof(str),fp)!=NULL){
        len = strlen(str);    /* line length, including the trailing newline */

        if(str[0]=='0')
            fileSizeLog[0]++;
        else if(len<MAXORDER)    /* skip lines too long to be a size, such as error messages */
            fileSizeLog[len]++;
    }

    _pclose(fp);

    if(argC==2 || (argC==3 && (argV[2][0]=='t'||argV[2][0]=='T'))){
        for(i=0;i<MAXORDER;i++)
            printf("\nSize Order < 10^%2d bytes : %ld",i,fileSizeLog[i]);
    }
    else if(argC==3 && (argV[2][0]=='g'||argV[2][0]=='G')){
        CONSOLE_SCREEN_BUFFER_INFO csbi;

        if(GetConsoleScreenBufferInfo(GetStdHandle(STD_OUTPUT_HANDLE),&csbi)){
            max = fileSizeLog[0];

            for(i=1;i<MAXORDER;i++)
                if(fileSizeLog[i]>max)
                    max = fileSizeLog[i];

            /* shrink the bars only when the largest count would not fit the buffer width */
            scale = (max < csbi.dwSize.X) ? 1.0 : (1.0*(csbi.dwSize.X-50))/max;

            for(i=0;i<MAXORDER;i++){
                printf("\nSize Order < 10^%2d bytes |",i);

                for(j=0;j<(int)(scale*fileSizeLog[i]);j++)
                    printf("%c",219);    /* 219 is the full block in code page 437 */

                printf("%ld",fileSizeLog[i]);
            }
        }
    }

    free(fileSizeLog);
    return 0;
}
</lang>
Invocation and textual output:
<pre>
C:\My Projects\threeJS>fileSize.exe "C:\My Projects" t
Size Order < 10^ 0 bytes : 1770
Size Order < 10^ 1 bytes : 1
Size Order < 10^ 2 bytes : 20
Size Order < 10^ 3 bytes : 219
Size Order < 10^ 4 bytes : 1793
Size Order < 10^ 5 bytes : 1832
Size Order < 10^ 6 bytes : 631
Size Order < 10^ 7 bytes : 124
Size Order < 10^ 8 bytes : 26
Size Order < 10^ 9 bytes : 0
Size Order < 10^10 bytes : 0
Size Order < 10^11 bytes : 0
Size Order < 10^12 bytes : 0
Size Order < 10^13 bytes : 0
Size Order < 10^14 bytes : 0
Size Order < 10^15 bytes : 0
Size Order < 10^16 bytes : 0
Size Order < 10^17 bytes : 0
Size Order < 10^18 bytes : 0
Size Order < 10^19 bytes : 0
Size Order < 10^20 bytes : 0
Size Order < 10^21 bytes : 0
Size Order < 10^22 bytes : 0
Size Order < 10^23 bytes : 0
Size Order < 10^24 bytes : 0
</pre>
Invocation and graphical output:
<pre>
C:\My Projects\threeJS>fileSize.exe "C:\My Projects" g
1832,300, 0.136463
Size Order < 10^ 0 bytes |█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████1770 |
Size Order < 10^ 1 bytes |1 |
Size Order < 10^ 2 bytes |██20 |
Size Order < 10^ 3 bytes |█████████████████████████████219 |
Size Order < 10^ 4 bytes |████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████1793 |
Size Order < 10^ 5 bytes |██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████1832 |
Size Order < 10^ 6 bytes |██████████████████████████████████████████████████████████████████████████████████████631 |
Size Order < 10^ 7 bytes |████████████████124 |
Size Order < 10^ 8 bytes |███26 |
Size Order < 10^ 9 bytes |0
Size Order < 10^10 bytes |0
Size Order < 10^11 bytes |0
Size Order < 10^12 bytes |0
Size Order < 10^13 bytes |0
Size Order < 10^14 bytes |0
Size Order < 10^15 bytes |0
Size Order < 10^16 bytes |0
Size Order < 10^17 bytes |0
Size Order < 10^18 bytes |0
Size Order < 10^19 bytes |0
Size Order < 10^20 bytes |0
Size Order < 10^21 bytes |0
Size Order < 10^22 bytes |0
Size Order < 10^23 bytes |0
Size Order < 10^24 bytes |0
</pre>
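The line 1832,300, 0.136463 at the top of the graphical output is not printed by the listing as shown. It appears to be the largest count, the console buffer width and the resulting scale factor, since (300 - 50) / 1832 is approximately 0.136463. Under that assumption, the bar-scaling rule can be sketched in isolation as follows. The function name bar_length and its parameters are hypothetical.

```c
/* Hypothetical helper mirroring the scaling rule in the listing:
   counts are drawn one block per file when the largest count is
   smaller than the buffer width, otherwise they are shrunk so the
   longest bar spans (width - margin) columns. */
int bar_length(long count, long max, int width, int margin)
{
    double scale = (max < width) ? 1.0 : (double)(width - margin) / max;

    return (int)(scale * count);   /* truncate towards zero */
}
```

With the sample figures, bar_length(20, 1832, 300, 50) yields 2 and bar_length(26, 1832, 300, 50) yields 3, which matches the two and three blocks drawn for those rows.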
Note that this implementation can track files up to 10^24 bytes (a yottabyte) in size, but if you have a file that large, you shouldn't need such programs. :)
=={{header|Perl 6}}== |
{{works with|Rakudo|2017.05}} |