Category:Smalltalk: Difference between revisions

 
(36 intermediate revisions by the same user not shown)
Line 14:
'''Smalltalk-80''' is an [[object-oriented programming|object-oriented]], dynamically typed, [[reflective programming]] language. It was designed and created in part for educational use, more so for Constructivist teaching, at Xerox PARC by Alan Kay, Dan Ingalls, Ted Kaehler, Adele Goldberg, and others during the 1970s, influenced by Sketchpad and Simula.
 
The language was generally released as Smalltalk-80 and has been widely used since. Smalltalk-like languages are in continuing active development, and the language has gathered a loyal community of users around it.
 
'''Smalltalk-80''' is a fully reflective system, implemented in itself. Smalltalk-80 provides both structural and computational reflection. Smalltalk is a structurally reflective system whose structure is defined by Smalltalk-80 objects. The classes and methods that define the system are themselves objects and fully part of the system that they help define. The Smalltalk compiler compiles textual source code into method objects, typically instances of <code>CompiledMethod</code>. These get added to classes by storing them in a class's method dictionary. The part of the class hierarchy that defines classes can add new classes to the system. The system is extended by running Smalltalk-80 code that creates or redefines classes and methods. In this way a Smalltalk-80 system is a "living" system, carrying around the ability to extend itself at run-time.
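
As a small illustration of the reflective structure described above, the following sketch assumes a Squeak/Pharo-style image. The selectors <code>includesSelector:</code> and <code>compile:</code> are common but not guaranteed to exist in every dialect, and the method name <code>double</code> is made up for this example. Evaluate the expressions one at a time in a workspace:
<lang smalltalk>"classes and methods are ordinary objects and can be asked about themselves"
3 class.                                  "the class of the integer 3"
3 class class.                            "the metaclass of that class"
Integer includesSelector: #factorial.     "does Integer itself define 'factorial'?"

"extend the running system: compile a new (hypothetical) method into Integer"
Integer compile: 'double
    ^ self * 2'.

21 double.                                "the new method is usable immediately: 42"</lang>
No restart or separate build step is involved; the compiler is just another object in the image that installs the new method object into the class's method dictionary.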
Line 88:
methodSpec ::= <unarySelector>
| <binarySelector> <argName>
| <keyword> <argName> [ <keyword> <argName> ] *
 
methodBody ::= [ <localVarDecl> ] <statementList>
Line 245:
 
=== Message Sending ===
Message sends are dynamically resolved, by letting the receiver of the message determine the method (i.e. the code to run). The message consists of a selector (= name of the message) and optional arguments. <br>The message syntax is:
<lang smalltalk>receiver selector</lang>
a unary message, without arguments.<br>In a C-like language, this would be written as "<I>receiver.selector()</I>".
 
<lang smalltalk>receiver part1: arg1 part2: arg2 ... partN: argN</lang>
a keyword message; the selector consists of the concatenation of the keyword parts: '<I>part1:part2:...partN:</I>'.<br>In a C-like language (assuming that colons are allowed in an identifier), this would be written as "<I>receiver.part1:part2:...partN:(arg1, arg2,... argN)</I>".
 
<lang smalltalk>receiver op arg</lang>
a so-called ''binary message''. The selector 'op' consists of one or more special characters, such as '+', '-', '@' etc.
These are actually syntactic sugar, especially to make arithmetic look more familiar (i.e. instead of "rcvr add: foo" we can write "rcvr + foo").<br>These would look the same in a C-like language, although almost any non-letter-or-digit character is allowed as operator in Smalltalk.
 
The precedence rules are unary > binary > keyword, thus
Line 268:
Classes themselves are objects and as such instances of some Metaclass. As classes define and provide the protocol (= set of methods) for their instances, metaclasses define and provide the protocol for their instances, the corresponding classes. Every class has its own metaclass and as such can implement new class-side messages. Typically, instance creation and utility code is found on the class side. Most Smalltalk dialects allow for the metaclass to specify and return the type of compiler or other tools to be used when code is to be installed. This allows for DSLs or other programming language syntax to be implemented seamlessly by defining a metaclass which returns a compiler for a non-Smalltalk language. Typical examples for this are parser generators (tgen, ometa, petite parser), data representation specs (asn1, xml etc.) and languages (smallRuby, graphical languages in squeak etc.).
 
Being objects, classes and metaclasses can be created dynamically, by sending a #subclass:... message to another class, or by instantiating a new metaclass.
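
The following is a minimal sketch of creating a class at run time. It assumes the classic Smalltalk-80 class definition message as found in Squeak and similar dialects (other dialects use different keywords, e.g. <code>package:</code> instead of <code>category:</code>), and it assumes <code>compile:</code> is available. The class <code>Account</code> and its methods are hypothetical. Evaluate the statements one at a time:
<lang smalltalk>"create a new class by sending a subclass-creation message to Object"
Object subclass: #Account
    instanceVariableNames: 'balance'
    classVariableNames: ''
    poolDictionaries: ''
    category: 'Demo'.

"instance-side methods go into the class itself"
Account compile: 'setBalance: amount
    balance := amount'.
Account compile: 'balance
    ^ balance'.

"class-side methods go into the metaclass, reached via 'class'"
Account class compile: 'newWithBalance: amount
    ^ self new setBalance: amount'.

(Account newWithBalance: 100) balance.    "100"</lang>
Because <code>setBalance:</code> has no explicit return, it answers the receiver, which is why the class-side instance creation method can return the result of that send directly.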
 
=== Exception Handling ===
Line 353:
By convention, global and namespace variables are to be used for classes only. As in every other language, it is considered extremely bad style to use globals for anything else.
They are created by telling the namespace to add a binding, as in <lang smalltalk>Smalltalk at:#Foo put:<something></lang>
The binding's value can be retrieved by asking the namespace: <lang smalltalk>x := Smalltalk at:#Foo</lang> or simply by referring to the binding by name (if the corresponding namespace is visible in the scope of the code): <lang smalltalk>x := Foo</lang>
As seen above, the global named <tt>Smalltalk</tt> refers to the Smalltalk namespace, which is globally visible. Technically, there is a binding to itself inside the Smalltalk namespace, and the Smalltalk namespace is visible everywhere by default.
 
====Class Variables (Statics)====
These are visible inside a class and all of its subclasses, and they have the lifetime of the defining class. Because the same binding is shared, a value assigned in the defining class is also seen in all of its subclasses. Typically, these are used for constants. Class variables are defined in the class definition message when the class is instantiated, or redefined later by setters sent to the class.
 
====Instance Variables====
These are the slots where the private state of an object is held. Instances have the same structure (layout), but each has its own values. The instance layout is defined in the class definition message when the class is instantiated or redefined by setters to the class.
Line 383 ⟶ 384:
 
===Special "builtin" Pseudo Variables===
<lang smalltalk>self "refers to the current receiver"
 
super "for super sends (to call the method in a superclass)"
 
thisContext "refers to the current context (stack frame/continuation) as an object"</lang>
 
===Message Sends===
<lang smalltalk>1000 factorial "send the 'factorial' message to the integer receiver"

a factorial even "send the 'factorial' message to whatever 'a' refers to,
 then send 'even' to whatever that returned"

a + 1 "send a '+' message, passing the argument '1' to whatever 'a' refers to"

(a + 1) squared "send '+' to 'a', then send 'squared' to whatever we get from it"

a , b "send the (binary) ',' message,
 which performs collection-concatenation (arrays, strings, etc)"

arr at:1 put:'foo' "send the 'at:put:' message to 'arr',
 passing two arguments, the integer '1' and a string"

a > b ifTrue: [ a print ] "send the 'ifTrue:' message to whatever 'a > b' returned (a boolean, usually),
 passing a block closure as argument.
 The implementation of boolean will either evaluate
 or not evaluate the passed block's code"

a > b ifTrue: [ a ] ifFalse: [b] "send 'ifTrue:ifFalse:' to the object returned by 'a > b',
 passing two block closures as arguments.
 The 'ifTrue:ifFalse:' method will evaluate one of them
 and return that block's return value as its own return value"

b := [ ... someCode... ]. "assign a block to the variable 'b'"
...
b value "evaluate the block's code (call the lambda closure)"

b2 value:123 "evaluate another block's code, passing one argument"</lang>
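
As a worked example of the unary > binary > keyword precedence, the following expressions can be evaluated in a workspace. The only rule used here that is not spelled out above is that binary messages among themselves are evaluated strictly left to right, without any arithmetic priority:
<lang smalltalk>3 + 4 factorial.        "unary first: 3 + (4 factorial) = 27"
2 + 3 * 4.              "binary messages go left to right: (2 + 3) * 4 = 20"
2 + (3 * 4).            "parentheses give the usual arithmetic reading: 14"
2 raisedTo: 3 + 4.      "keyword last: 2 raisedTo: (3 + 4) = 128"
#(1 2 3) size + 1.      "unary 'size' first, then '+': 4"</lang>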
 
=== Other ===
<lang smalltalk>expr1 . expr2 "expressions (statements) within a method or block are separated by a full stop."

'hello' print. 'world' print "expressions are separated by a full stop; just like in English"

foo := bar "assignment; let foo refer to the object to which bar refers
 (at that particular point in time)"

foo := bar := 0. "assignment has a value"

^ a + 1 "return; the value of 'a+1' as the value of the current method invocation."

|a b c| "local variables; introduces 'a', 'b' and 'c' in the current scope
 (let-like local bindings)"

r msg1; msg2 "so called cascade;
 first send msg1 to r, ignoring the return value,
 then send msg2.
 The value of the expression is the result from the last message.
 Syntactic sugar for (t := r) msg1. t msg2
 but an expression, not a statement (and with an anonymous variable 't')"</lang>
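
A typical use of the cascade is filling a collection. The sketch below relies on the common convention that <code>add:</code> answers its argument rather than the receiver, which is why <code>yourself</code> is usually placed last when the collection itself is wanted:
<lang smalltalk>| coll last |
coll := OrderedCollection new.
coll add: 1; add: 2; add: 3.          "three sends to the same receiver"

"the value of a cascade is the value of its last message"
last := coll add: 4; add: 5.          "last is 5, not the collection"

"'yourself' answers the receiver, so the whole expression is the collection"
coll := OrderedCollection new add: 1; add: 2; yourself.
coll size.                            "2"</lang>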
 
=== Class Definition ===
Line 439 ⟶ 455:
</lang>
Classes can be anonymous: in most systems, you can evaluate "Class new new" to create a new anonymous class and an instance of it. Be aware, however, that the class should contain enough state to properly specify its instances' layout, so usually more info is needed (number and names of instance variables, inheritance etc.). Otherwise some tools (Inspector) may fail to reflect on it.
 
Be reminded that classes are themselves objects and therefore instances of some class; in this case a so-called <I>Metaclass</I>. They follow the same inheritance and message send semantics as ordinary objects. Thus, you can define class-side methods and redefine them in subclasses. For example, you can redefine the "new" method to implement caches, singletons or any other special feature. Often multiple instance creation methods (with different parameters) are provided by a class.
 
The class also provides reflection protocol, e.g. to retrieve all of its instances, to ask for the names of private slots, to ask for the set of supported messages etc.
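
A few such reflective queries are sketched below. The selector names are those used in Squeak/Pharo-style images and are an assumption here; other dialects may spell some of them differently (for example <code>instanceVariableNames</code> instead of <code>instVarNames</code>):
<lang smalltalk>Point superclass.               "the class Point inherits from"
Point class.                    "its metaclass"
Point instVarNames.             "names of the private slots of Point instances"
Point selectors.                "the messages implemented by Point itself"
Point canUnderstand: #x.        "do instances respond to #x (inherited methods included)?"
Point allInstances size.        "how many Point instances currently exist in the image"</lang>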
 
=== Control Structures ===
As mentioned above, these are defined as messages and their implementation is found in the corresponding receiver classes.

For illustration, here is how conditional execution ('if-then-else') is implemented in Smalltalk. There are two boolean objects named "true" and "false", which are singletons of corresponding classes named "True" and "False" (both inherit from Boolean, which inherits from Object). It is essential that these are singletons, and that typical relational operators like "<", ">" etc. return one of those two.

<br>Then, in the True class, define:
<lang smalltalk>ifYouAreTrueThenDo: arg1 ifNotThenDo: arg2
^ arg1 value</lang>
Line 451 ⟶ 474:
<lang smalltalk>(a > 0) ifYouAreTrueThenDo:[ 'positive' printCR ] ifNotThenDo:[ 'negative' printCR ]</lang>
Actually, because these two return the value of the block they evaluated, it can also be used for its value (in C, the ternary if expression), as in:
<lang smalltalk>outcome := (a > 0) ifYouAreTrueThenDo:['positive'] ifNotThenDo:['negative'].
outcome printCR</lang>
Finally, by adding a self-returning "value" method to the Object class, we can also write:
<lang smalltalk>( (a > 0) ifYouAreTrueThenDo:'positive' ifNotThenDo:'negative' ) printCR</lang>
(knowing that the "ifXX"-methods send <tt>value</tt> to the corresponding arg and return that)
 
In this style, all of Smalltalk's control structures, loops, enumeration, stream readers and event- or exception-handling constructs are built. And since every class is open for extension, you can easily add additional convenient control functions. This is one place where dialects differ, so very often some of those have to be carried along when porting applications from one dialect to another.
 
The following is only a tiny subset - there are virtually hundreds or thousands of uses of blocks for control structures in the system.
Line 468 ⟶ 492:
[ loop code . condition expression ] whileTrue.
[ loop code ] doWhile:[ condition ].
[ loop code ] doUntil:[ condition ].
 
n timesRepeat:[ block to be looped over ] "n being an integer"
Line 481 ⟶ 506:
 
[some code] fork "starts a new thread"</lang>
 
=== Return from a Block ===
It should be noted that a "^" (return) inside a block will return from the enclosing method, NOT only from the block. This is an essential semantic property of the return (technically, it may be a long return from a deeply nested call hierarchy, possibly involving unwind actions).
 
This makes it possible to pass a block to e.g. collections to enumerate elements up to the point where some condition is met. For example, if we need a helper method which searches for the first element in a dataset that meets some condition and evaluates an action on it, we can write:
<lang smalltalk>findSomeElementWhichMeetsCondition:conditionBlock thenDo:actionBlock ifNone:failBlock
dataSet do:[:eachElement |
(conditionBlock value:eachElement) ifTrue:[
^ actionBlock value:eachElement
]
].
^ failBlock value</lang>
Here, a block is passed to the dataSet's "do:" method, which will return (if invoked inside the "do:") from the containing findSomeElement method.
The above can be used as:
<lang smalltalk>myDataSet
findSomeElementWhichMeetsCondition:[:record | record name = 'John']
thenDo:[:record | record print ]
ifNone:[ 'nothing found' print ]</lang>
If a block return only returned from the inner scope (as e.g. in JavaScript), this would not be possible, and ugly workarounds (like exceptions or long-jumps) would be needed.
 
There are rare situations where an explicit block return is needed (for example, to break out of a loop in the middle, without returning from the method). For this, a block provides a special "valueWithExit" method, so you can write:
<lang smalltalk>1 to:10 do:[:outerLoopsI |
[:exit |
1 to:10 do:[:innerLoopsI |
...
someCondition ifTrue:exit
]
] valueWithExit
]</lang>
 
=== Exceptions and Handlers ===
Line 487 ⟶ 541:
Smalltalk supports proceedable exceptions.
 
<lang smalltalk>[ try block to be evaluated ] on: exception do:[:ex | handler code ]</lang>
where '<I>exception</I>' is an Exception class or Signal instance and the '<I>ex</I>' argument provides detail information (where and why) to the handler and also allows control of how to continue afterwards (proceed, return, restart, reject).
<br>The handler basically has the following options:
* ex return - return out of the try block
Line 499 ⟶ 553:
 
Handlers can also be defined to handle a collection of non-related exceptions, by creating an exceptionSet:
<lang smalltalk>[ try block to be evaluated ] on: (ZeroDivide, DomainError) do:[:ex | handler ]</lang>
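
The following sketch shows a few of these handler options and an exception set with concrete code. It assumes the ANSI-style selector names <code>return:</code>, <code>retry</code> and <code>resume:</code> and the standard classes <code>ZeroDivide</code>, <code>Warning</code> and <code>MessageNotUnderstood</code>; dialects that use the proceed/restart/reject vocabulary may spell some of these differently:
<lang smalltalk>| result divisor |

"return: leave the protected block and use the given value instead"
result := [ 10 / 0 ] on: ZeroDivide do: [:ex | ex return: 0 ].
"result is 0"

"retry: fix the cause and evaluate the protected block again"
divisor := 0.
result := [ 10 / divisor ]
    on: ZeroDivide
    do: [:ex | divisor := 2. ex retry ].
"result is 5"

"resume: continue after the point where a proceedable exception was signaled"
result := [ (Warning signal: 'careful') + 1 ]
    on: Warning
    do: [:ex | ex resume: 41 ].
"result is 42"

"an exception set handles several unrelated exceptions with one handler"
result := [ nil foo ]
    on: (ZeroDivide, MessageNotUnderstood)
    do: [:ex | ex return: #handled ].
"result is #handled"</lang>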
Finally, many dialects provide syntactic sugar for common situations:
<lang smalltalk>exception catch: [ action ] "to return out of the action, without any particular handler action"
Line 514 ⟶ 568:
 
=== Multithreading ===
New threads are started by sending 'fork' to a block; this will create a process instance &sup1; which executes the block's code in a separate thread (within the same address space):
<lang smalltalk>[ do something ] fork.
 
[ do something ] forkAt: priorityLevel</lang>
1) Notice that these are technically threads, not "unix processes". They execute in the same address (or object-) space.
They are named "Process" and created with "fork" in Smalltalk for historic reasons.
 
The scheduling behavior is not standard among dialects/implementations. Some only support manual switching by explicit yield, most provide strict priority-based scheduling, and some even provide preemptive timeslicing and dynamic priorities. The details may also depend on the underlying operating system.
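
Because scheduling differs between dialects, code that needs the result of a forked block should synchronize explicitly instead of relying on timing. A minimal sketch, assuming the standard <code>Semaphore</code> class with <code>signal</code> and <code>wait</code>:
<lang smalltalk>| done sum |
done := Semaphore new.
sum := 0.

"the forked block runs as a separate Smalltalk process in the same object space"
[ 1 to: 100 do: [:i | sum := sum + i ].
  done signal ] fork.

done wait.        "suspend the current process until the worker signals"
sum.              "5050"</lang>
Both processes share the variable <code>sum</code> directly, since they execute in the same address space; the semaphore only makes sure it is read after the worker has finished.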
 
== Implementation ==
Line 528 ⟶ 582:
Some implementations support source-to-source compilation to C, JavaScript or Java. These may or may not show some limitations in the support for dynamic changes at execution time. Typically, the full dynamic bytecode is used for development, followed by a compilation phase for deployment/packaging.
 
All Smalltalks use and depend on garbage collection for automatic reclamation of unused objects, and most implementations use modern algorithms such as generation scavenging, incremental background mark&sweep collectors, weak references and finalization support. Imprecise conservative collectors are typically not used. Reference counting was abandoned in the 70s.
 
As message send performance is critical in Smalltalk, highly tuned cache mechanisms have been invented and are used: inline caches, polymorphic inline caches, dynamic recompilation based on receiver and/or argument types etc.
Line 534 ⟶ 588:
 
== Influences ==
Smalltalk syntax is meant to be read like English sentences, and messages look like orders "someone doSomethingWith: someArgument".
 
The syntax is very compact and almost every semantic feature is implemented via a message send to some receiver object, instead of being a syntactic language feature of the compiler. As such, and because the compiler is part of the runtime environment, changes, fixes and enhancements of such features can be made easily (and are made). It has therefore often been used, and still is, as a testbed for research.
 
Smalltalk's symbols correspond to Lisp symbols, blocks are syntactic sugar for lambda closures.
Line 585 ⟶ 639:
ifTrue:[1]
ifFalse:[ self * (self-1) factorial ]</lang>
(here '<I>self</I>' is the Integer receiver object, and "^" returns a value from the message send).
 
To get the factorial value, we'd evaluate in a workspace:<lang>10 factorial</lang>
Line 596 ⟶ 650:
 
a) is somewhat inconvenient if the code example consists of multiple methods, possibly in multiple classes.
<br>b) comes with the additional trouble that fileIn formats are different (chunk file, vs. XML file, vs. Monticello, vs. GNU-ST etc.).
 
For example, for export/import, GNU-ST uses a ''private format'', (which has the advantage of not needing the chunk format's bangs and especially the ugly bang doubling inside code and the empty chunk at the end):
<lang smalltalk>Number extend [
my_factorial [
Line 609 ⟶ 663:
]</lang>
 
However, both are incompatible and not supported by most other dialects, which use the historic Smalltalk-80 chunk format:
<lang smalltalk>!Number methodsFor:'math'!
my_factorial
^ (self < 2) ifTrue:[1] ifFalse:[ self * (self-1) my_factorial]
! !</lang>
This chunk format is supported by all systems, but it is somewhat ugly to read and also needs exclamation marks to be doubled in the code (which looks especially bad in string literals).
 
Inside the Smalltalk IDE, you will never see any of the above, as this is only used as interchange format.
 
So in which dialect's fileOut format should the example be presented to be most convenient, readable, and to be repeatable in case someone wants to try Smalltalk?
Anonymous user