Run-length encoding/C: Difference between revisions

no edit summary
(→‎[[Run-length encoding/C#Unbuffered - generative: The following code sample is experimental as it implements python style iterators.)
No edit summary
 
(3 intermediate revisions by 2 users not shown)
Line 10:
 
Also: In this implementation, '''string'''s are assumed to be null terminated. Hence cannot contain the ''null'' character as data.
<langsyntaxhighlight lang="c">#include <stdio.h>
#include <stdlib.h>
#include <string.h>
 
const char *input = "WWWWWWWWWWWWBWWWWWWWWWWWWBBBWWWWWWWWWWWWWWWWWWWWWWWWBWWWWWWWWWWWWWW";
const char *output = "12W1B12W3B24W1B14W";
 
typedef void (*YIELDCHAR)(); /* yields via yield_init */
Line 29:
#define GOTO(label) longjmp(label, TRUE)
#endif
 
/* the following line is the only time I have ever required "auto" */
#define FOR(i,iterator) auto BOOL lambda(i); yield_init = (void *)&lambda; iterator; BOOL lambda(i)
Line 37:
#define CONTINUE return TRUE
#define OD CONTINUE; }
/* Warning: _Most_ FOR(,){ } loops _must_ have a CONTINUE, as the last statement. BREAK
* or OD as the terminating statement. Otherwise the lambda will
* Otherwise the lambda will return random value from stack, and may terminate early */
 
typedef BOOL ITERATOR;
static volatile void *yield_init; /* not thread safe */
#define YIELDS(type) BOOL (*yield)(type) = yield_init
 
ITERATOR gen_char_seq (const char *s){
YIELDS(char);
fflush(stdout);
int upb_s = strlen(s);
int i; for(i = 0; i <= upb_s; i++) YIELD(s[i]);
}
 
Line 67 ⟶ 68:
count = 1;
YIELD(prev); prev = c;
} else {
count += 1;
}
OD;
Line 78 ⟶ 79:
}
 
const char *zero2nine = "0123456789";
 
ITERATOR gen_decode (GENCHAR gen_char){
Line 86 ⟶ 87:
if(strchr(zero2nine, c)){
repeat = repeat*10 + c - '0';
} else {
int i; for(i = 1; i <= repeat; i++ ){ YIELD(c); }
repeat = 0;
}
Line 93 ⟶ 94:
}
 
int main(){
/* iterate through input string */
printf("Encode input: ");
Line 100 ⟶ 101:
OD;
printf("\n");
 
/* iterate through output string */
printf("Decode output: ");
Line 107 ⟶ 108:
OD;
printf("\n");
return 0;
}</lang>
}</syntaxhighlight>
Output:
<pre>
Line 113 ⟶ 115:
Decode output: WWWWWWWWWWWWBWWWWWWWWWWWWBBBWWWWWWWWWWWWWWWWWWWWWWWWBWWWWWWWWWWWWWW
</pre>
 
=== Buffered ===
These functions have no check for the size of the output buffers.
Line 120 ⟶ 123:
Since repeat counter must fit a single byte in this implementation, it can't be greater than 255, so a byte repeated more than 255 times generates in the compressed stream more than 2 bytes (4 bytes if the length of the repeated byte sequence is less than 511 and so on)
 
<syntaxhighlight lang="c">
<lang c>int rle_encode(char *out, const char *in, int l)
{
int dl, i;
Line 142 ⟶ 146:
*out++ = i; *out++ = cp; dl += 2;
return dl;
}
}</lang>
</syntaxhighlight>
 
'''Decoding function'''
 
<syntaxhighlight lang="c">
<lang c>int rle_decode(char *out, const char *in, int l)
{
int i, j, tb;
Line 158 ⟶ 164:
}
return tb;
}
}</lang>
</syntaxhighlight>
 
'''Usage example'''
 
<langsyntaxhighlight lang="c">#include <stdio.h>
#include <stdlib.h>
#include <string.h>
Line 170 ⟶ 177:
int main()
{
char *d = (char *)malloc(2*strlen(o));
char *oc = (char *)malloc(strlen(o));
int rl = rle_encode(d, o, strlen(o));
Line 181 ⟶ 188:
free(d); free(oc);
return 0;
}
}</lang>
</syntaxhighlight>
 
In the following codes, encoding and decoding are implementendimplemented as "filters" which compress/decompress standard input to standard output writing ASCII strings; they will work as long as the input has no ASCII digits in it, and the compressed/original ratio of a "single group" will be less than or equal to 1 as long as the ASCII decimal representation's length of the repeat counter will be shorter than the length of the "group". It should be so except in the case the group is a single isolated character, e.g. B gives 1B (one byte against two compressed bytes)
 
'''Encoding filter'''
 
<syntaxhighlight lang="c">
<lang c>#include <stdio.h>
 
int main()
Line 203 ⟶ 212:
}
printf("%d%c", i, cp);
return 0;
}</lang>
}
</syntaxhighlight>
 
'''Decoding filter'''
 
<syntaxhighlight lang="c">
<lang c>#include <stdio.h>
 
int main()
Line 216 ⟶ 228:
for(j=0; j < i; j++) printf("%c", c);
}
return 0;
}</lang>
}
</syntaxhighlight>
 
'''Final note''': since the repeat counter value 0 has no meaning, it could be used as it would be 256, so extending by one the maximum number of repetitions representable with a single byte; or instead it could be used as a special marker to encode in a more efficient way (long) sequences of ''isolated characters'', e.g. "ABCDE" would be encoded as "1A1B1C1D1E"; it could be instead encoded as "05ABCDE".