Talk:Compiler/Simple file inclusion pre processor

wrong language

This task would probably be more useful and help compare different languages better (the point of this site) if it handled the same syntax used by Compiler/Sample_programs, rather than ALGOL 68 processes ALGOL 68, C processes C, PL/1 processes PL/1, Phix processes Phix, etc. --Pete Lomax (talk) 17:35, 6 June 2021 (UTC)

You have a point, but by processing the actual language, it compares the language's pre-processor facilities and also how hard pre-processing the source is.

I strongly suspect that the syntax of Algol 68, PL/1 and COBOL (for example) makes the implementation harder than a C pre-processor would be, which was part of my motivation.

To pre-process Algol 68, PL/1 and COBOL for this task requires some level of lexical analysis, whilst a C pre-processor need only find lines that start with "#include" ( possibly with spaces before and after the "#" ). --Tigerofdarkness (talk) 18:25, 6 June 2021 (UTC)

Thanks for considering and attempting the task, BTW. --Tigerofdarkness (talk) 18:36, 6 June 2021 (UTC)

As per the abortive Phix entry each submission c/should explain some of the issues that might arise were it applied to the specific language itself instead of the toy C.

(And at least in the Phix case extol the virtues of any existing builtin mechanisms that provide similar functionality.)

The task is about how the builtin mechanism can be provided - the Phix compiler (as I understand it) is written in Phix and so presumably the facilities are provided by code in Phix? --Tigerofdarkness (talk) 21:42, 6 June 2021 (UTC)

It would without question still require some form of lexical analysis anyway, eg

/*
#include tests
*/
#include hello_world.t
#include phoenix_number.t
print("#include tests completed\n")

with expected output of

Hello, World!
142857
#include tests completed

I would specifically state the task need not attempt to resolve any issues caused by using/declaring the same identifiers in multiple files.

I (still) think the task should require toy C to be targetted, with any host-language-specific code parked on sub-pages. --Pete Lomax (talk) 20:17, 6 June 2021 (UTC)

I'm not sure why you mention "issues caused by using/declaring the same identifiers in multiple files" - my mention of "Many programming languages allow file inclusion, so that for example, standard declarations or code-snippets can be stored in a separate source file and be included into the programs that require them" was meant to indicate why pre-proessors were used.

Some lexical analysis is required to parse #include lines, but not as much as in e.g. PL/1 where you could see: main: procedure; %include "someCode.incl.pl1"; end main;.

I still argue that the task as written is a task that compares languages. In particular it compares the pre-processing facilities of languages.

The Self-hosting compiler task is a much harder, language specific task - it doesn't for example, say write a C compiler in your language. The Compiler/lexical_analyzer and related tasks are about writing a compiler for a simple C like language. The Algol 68 pre-processor in Algol 68 is similar in size to one of those tasks, so I think this task is not too large for an RC task.

Also, I doubt that a C pre-processor written in anything but C would get much use for anything other than bootstrapping... --Tigerofdarkness (talk) 21:42, 6 June 2021 (UTC)