Talk:Binary strings: Difference between revisions

→‎Break out?: about "binary/byte string"
: I then took a look at those tasks, and they do not focus on the concept of "byte strings"; rather, they deal with text strings. This is an ''issue'' if the text string implementation uses a terminator character, as C does; in fact the C solutions to those tasks ([[Copy a string]], [[String concatenation]], [[String length]]) work only for null-terminated strings (i.e. the "null" char can't be part of the string). (Of course this does not happen in every language, but C is among those having this "problem".) I think it would be enough to add some more C code to those tasks... '''Or''' maybe I gave the task the wrong name; should it be "Basic binary string manipulation functions"? ("binary string" or, according to Wikipedia, "bytestring") --[[User:ShinTakezou|ShinTakezou]] 15:23, 15 April 2009 (UTC)
::You're in a better position to figure that out than I am. I don't think I was ever really clear on byte strings and binary strings (probably because most of the string work I've done is in Java where most of the details are hidden or irrelevant). --[[User:Mwn3d|Mwn3d]] 19:58, 15 April 2009 (UTC)
::: There's nothing but a conventional distinction (though the following Java example suggests it may not be so conventional after all...). Generally a string is just a sequence of "symbols" (bytes); even text is made of bytes, of course... The distinction just stresses the fact that the bytes can be interpreted as text (according to which encoding...?) and are not generic binary data. Strings are not exactly "binary safe" in Java, but there's no terminator in use:
 
<lang java>public class binsafe {
    public static void main(String[] args) {
        // Octal escapes: two NUL chars, "test", then chars 1 and 255 (U+00FF).
        System.out.print("\000\000test\001\377");
    }
}</lang>
 
::: Outputs
 
<pre>$ java -cp . binsafe |hexdump -C
00000000  00 00 74 65 73 74 01 c3  bf                       |..test...|
00000009</pre>
 
::: This looks odd, since the byte 255 (octal 377) comes out UTF-8 encoded; in fact:
 
<pre>$ printf "\xc3\xbf" |iconv -f utf-8 -t latin1 |hexdump -C
00000000  ff                                                |.|
00000001</pre>
 
::: Maybe there's a method in the String class that tells Java not to "interpret" the string, or maybe such a task in Java should be accomplished using a custom class that internally uses byte[]. --[[User:ShinTakezou|ShinTakezou]] 21:53, 15 April 2009 (UTC)
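:::: For what it's worth, here is a minimal sketch of the byte[] route, not a definitive answer. It assumes that mapping each char in the range 0..255 to a single byte via ISO-8859-1 is what is wanted; the class name <code>BinSafeBytes</code> and the helpers <code>rawBytes</code> and <code>utf8Bytes</code> are made up for illustration. The bytes are written with <code>PrintStream.write(byte[], int, int)</code>, which bypasses the character encoding that <code>print</code> applies.

```java
import java.nio.charset.StandardCharsets;

public class BinSafeBytes {
    // Hypothetical helper: one byte per char (valid only for chars 0..255).
    public static byte[] rawBytes(String s) {
        return s.getBytes(StandardCharsets.ISO_8859_1);
    }

    // Hypothetical helper: the same string encoded as UTF-8, for comparison.
    public static byte[] utf8Bytes(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] data = rawBytes("\000\000test\001\377");
        // Write the bytes as they are; no charset conversion happens here.
        System.out.write(data, 0, data.length);
        System.out.flush();
    }
}
```

:::: Piped through <code>hexdump -C</code>, this should give 8 bytes ending in <code>ff</code> rather than 9 bytes ending in <code>c3 bf</code>. Chars above 255 cannot be represented this way, so a real "binary string" class would probably keep a byte[] from the start instead of going through String.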