sergio masci wrote:

>> I suspect that in many compilers (3) and (2) end up (as an
>> intermediate representation) in something that's equivalent to (1).
>> But even if not, it shouldn't be too difficult to make all three end
>> up in the same, whatever the compiler's internal representation of
>> this statement is. 
> 
> Ok so I write a separate parser for all three and internally they all
> produce:
> 
> 	if_statement
> 	.. expr
> 	.. .. lt
> 	.. .. .. x
> 	.. .. .. 0
> 	.. statement
> 	.. .. assign
> 	.. .. .. y
> 	.. .. .. 0
> 	.. statement
> 	.. .. assign
> 	.. .. .. y
> 	.. .. .. x
> 
> So what advantage does this give me?

I don't know, but if you do it, you probably know :)  Seriously, I don't
understand the question in this context.

> If instead of the keyword 'if' I used a function 'xyz' thus:
> 
> 	xyz( lt(x, 0), assign(y, 0), assign(y, x) )
> 
> my parsers would all now produce:
> 
> 	expr
> 	.. func
> 	.. .. xyz
> 	.. .. expr
> 	.. .. .. lt
> 	.. .. .. .. x
> 	.. .. .. .. 0
> 	.. .. expr
> 	.. .. .. assign
> 	.. .. .. .. y
> 	.. .. .. .. 0
> 	.. .. expr
> 	.. .. .. assign
> 	.. .. .. .. y
> 	.. .. .. .. x
> 
> Internally the compiler has different sections that deal with
> generating code for 'if_statement' trees, 'statement' trees and 'expr'
> trees. 
> 
> How would you propose that I treat 'xyz' differently while parsing
> and how would you add a specific code generator for the now special
> 'xyz' (you need to describe all this somehow)?

'if' is a special statement/construct/function, defined in the language
standard. 'xyz' is not. Therefore, the compiler can have (and generally
has) special code to generate 'if' more efficiently than a function
call.

Compared to the original issue -- lists --, 'if' is much simpler, and
the function call overhead matters here: it would typically cost more
than the actual functionality. That's why it generally makes sense to
implement such a construct directly, avoiding the function call
overhead.

I'm not a compiler specialist, but it could be that 'avoiding the
function call overhead' is a premature optimization, and that a later,
lower-level optimization could result in just this. In any case, since
'if' is defined in the language standard, there's nothing that would
prevent a compiler writer from implementing it in the compiler.

> the 'if_statement' code generator knows that it might have either one
> or two statements following the condition expression. It also knows
> that if the condition expression evaluates to a compile time constant
> that it can discard either the 'true' or 'false' statements. It also
> needs to interact with the code generator for the condition
> expression to allow that generator to produce efficient optimised
> code jumps to the 'true' or 'false' statements (think of early out
> logical expressions involving '&&' and '||'). Consider the difference
> between "if (cond)..." and "X=cond" 

Yes. This all is pretty much C. I thought we weren't really talking
about any specific languages, but about 'implemented in the compiler'
versus 'implemented in a library'. 
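
To make the quoted point about compile-time constants concrete, here is a
minimal sketch of an 'if_statement' code generator that discards the dead
branch. Everything in it (Condition, IfStatement, generate, the
pseudo-assembly mnemonics) is hypothetical and invented for illustration; it
is not taken from any real compiler, and it ignores the '&&'/'||' early-out
case entirely.

```cpp
#include <cassert>
#include <string>

// Hypothetical miniature of the 'if_statement' tree quoted above.
struct Condition {
    bool is_constant;  // true when the value is known at compile time
    bool value;        // meaningful only when is_constant is true
    std::string text;  // source text of a non-constant condition
};

struct IfStatement {
    Condition cond;
    std::string then_code;
    std::string else_code;
};

// Emit made-up pseudo-assembly. When the condition is a compile-time
// constant, only the branch that can run is emitted.
inline std::string generate(const IfStatement& s) {
    if (s.cond.is_constant) {
        return s.cond.value ? s.then_code : s.else_code;
    }
    return "jump_if_false (" + s.cond.text + ") L_else\n" +
           s.then_code + "\njump L_end\nL_else:\n" +
           s.else_code + "\nL_end:\n";
}

// Small self-check, callable from any main().
inline void check_generate() {
    IfStatement folded = {{true, true, ""}, "y = 0", "y = x"};
    assert(generate(folded) == "y = 0");
    IfStatement branching = {{false, false, "x < 0"}, "y = 0", "y = x"};
    assert(generate(branching).find("jump_if_false (x < 0)") == 0);
}
```

Nothing here depends on whether the tree came from 'if' syntax or from a
function-call syntax; the generator only needs to recognize the node.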

The 'if' statement implementation is so short that going through a
general-purpose function call convention would blow up the code
tremendously. But:
1) Nobody says that the compiler writer /has/ to do this. The 'if'
statement is defined in the language standard, and the compiler writer
can choose to implement it directly in the compiler.
2) Even if the compiler writer chooses to implement it as a function call,
I think it is possible that a lower-level optimization detects the
inefficiencies and successfully optimizes the function call away
(remember that I considered that the function is available as source),
reaching the same code as if implemented directly.
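
As a rough illustration of point 2, here is a sketch in C++ of an 'if'
written as an ordinary library function whose source is visible to the
compiler. The names if_then_else and clamp_negative_to_zero are made up for
this example. Two caveats: the branches have to be passed as callables, since
plain arguments would both be evaluated before the call, and the function
still bottoms out in a built-in conditional ('?:'), so this shows the
inlining argument only, not a language without any built-in branching. I
have not checked what any particular compiler generates for it.

```cpp
#include <cassert>

// Hypothetical library-level 'if': both branches are passed as callables
// so that only the selected one runs.
template <typename Then, typename Else>
inline void if_then_else(bool cond, Then then_branch, Else else_branch) {
    cond ? then_branch() : else_branch();
}

// The example from this thread: if (x < 0) y = 0; else y = x;
inline int clamp_negative_to_zero(int x) {
    int y = 0;
    if_then_else(x < 0,
                 [&] { y = 0; },
                 [&] { y = x; });
    return y;
}

// Small self-check, callable from any main().
inline void check_if_then_else() {
    assert(clamp_negative_to_zero(-5) == 0);
    assert(clamp_negative_to_zero(7) == 7);
}
```

Because the template body and both lambdas are available as source, an
optimizing compiler is free to inline all three, which is the situation I
had in mind above.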

>> So, yes, I think for a compiler these three could be identical. I
>> don't see that a compiler could derive any information from any of
>> them that it couldn't derive from the other two.
> 
> Actually it can. Consider a long complex program made up purely of
> functions as in (1). What happens with a misplaced comma or
> parenthesis? 

That's a feature of that specific syntax, not a consequence of whether
the 'if' function is implemented in the compiler or in a library. We
didn't discuss the various merits of the different syntaxes (sp ?? :)

> The verbose syntax lets the compiler catch silly mistakes. 

Of course. The more redundant (that is, verbose) the syntax is, the
easier it is both for the programmer to get something wrong and for the
compiler to catch when something is wrong. But we didn't discuss the
merits of different syntaxes, we discussed merits of 'implemented in the
compiler' versus 'implemented in a standard library'.

>> Provided, of course, that the functions used in (1) are just as
>> defined as the operators and statements used in (2) and (3).
> 
> Ok so I'll give you that, all the functions are defined in exactly
> the same way in (1) as they are in (2) and (3). But what are we going
> to do about the vast number of functions defined in the /standard/
> library?

I don't understand the question. I didn't mean to suggest that all
functions need to have an equivalent in the forms (2) or (3), but rather
the other way round, that typically constructs of the forms (2) or (3)
have an equivalent function call syntax that does the same
(functionally, not necessarily in terms of typing or user-friendliness).
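
A small C++ sketch of what I mean by an equivalent function call syntax: the
comparison from the example, once with the built-in operator and once as a
call to the standard library's std::less. The wrapper names
is_negative_operator and is_negative_function are invented for this example.

```cpp
#include <cassert>
#include <functional>

// Forms (2)/(3): built-in operator syntax.
inline bool is_negative_operator(int x) { return x < 0; }

// Form (1): the same comparison spelled as a call to a standard library
// function object.
inline bool is_negative_function(int x) { return std::less<int>()(x, 0); }

// Small self-check, callable from any main().
inline void check_equivalent_forms() {
    for (int x = -2; x <= 2; ++x) {
        assert(is_negative_operator(x) == is_negative_function(x));
    }
}
```

The two forms do the same thing; they differ only in how pleasant they are
to type and read.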


>>> AND because the compiler should be able to "understand" functions as
>>> easily as other language statements that it is a convenient way to
>>> extend the language. 
>> 
>> Yes, extend or customize. That's approximately the C++ standard
>> library way. 
> 
> But the C++ way is horrible! You have CLASS upon CLASS upon CLASS. If
> you want to write a modest program you end up so deep in 'standard'
> classes and templates that it gets very hard to see the wood for the
> trees.

This is not about a specific implementation of the principle, this is
about the principle. You always bring in C, even though (or because?) we
already agreed that the C way is pretty much horrible. And we probably
can agree that the BASIC way is horrible, too -- some exceptions
notwithstanding :)

> This nonsense that user classes should be written in such a way as to
> have special methods that the standard libraries expect (things like
> iterators) so that the items in a container can be accessed. The
> programmer shouldn't need to know about all this. He should be able
> to just say (e.g.)
> 
> 	for all items in list FRED do
> 
> 		*.x = $.x + 1
> 	done

I don't really understand this. This is probably your syntax, and quite
familiar to you, but I don't think a majority of list readers here would
know what this does. In any case, I don't.

Anyway, for what it's worth, and independently of the issue we're
discussing (compiler-built-in vs standard-library-implemented), my C++
code looks similar:

BOOST_FOREACH( item& i, FRED ) {  // reference, so FRED's own elements change
  ++i.x;
}

(If this is what your code does... since I don't know what it does, I
can't really tell, but it probably is trivial to correct it if it's
doing something different. And I don't generally use identifiers like
FRED for lists in C++, but that's only a style question.)

But again, this is not a discussion of C++ style syntax versus BASIC
style syntax (yet :) -- and I don't see anything particularly
advantageous about the C++ style syntax. But all of C++'s list handling
is implemented in libraries -- and that's the issue. 

(I used here an element from the Boost library. It's not a standard
library, but it could be one. Whether or not a given library is a
standard library is just a matter of definition, and not of principle.)
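
For comparison, a sketch of the same kind of loop using only the C++
standard library, assuming the loop is meant to add one to the x member of
every element. The 'item' type and the function name increment_all are
hypothetical, chosen to match the snippet above.

```cpp
#include <cassert>
#include <list>

// Hypothetical element type with a single member x.
struct item {
    int x;
};

// Add one to the x member of every element. Both the container (std::list)
// and the iteration protocol (begin/end iterators) come from the library.
inline void increment_all(std::list<item>& items) {
    for (std::list<item>::iterator it = items.begin(); it != items.end(); ++it) {
        ++it->x;
    }
}

// Small self-check, callable from any main().
inline void check_increment_all() {
    std::list<item> fred;
    item a = {1};
    item b = {41};
    fred.push_back(a);
    fred.push_back(b);
    increment_all(fred);
    assert(fred.front().x == 2);
    assert(fred.back().x == 42);
}
```

The compiler itself knows nothing about lists here; it only sees a class
template and some overloaded operators.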

Gerhard