On Wed, 22 Jul 2009, Gerhard Fiedler wrote:

> sergio masci wrote:
> 
> >> sergio masci wrote:
> >> 
> >>> If I understand you correctly what you are saying is that if the
> >>> compiler writer is already aware of the /standard/ library before
> >>> he starts writing the compiler AND the /standard/ library
> >>> definition is set in concrete just as the language definition is
> >>> then the compiler writer is able to use intimate knowledge of the
> >>> /standard/ library functions within the compiler without
> >>> incorporating the code generation of the /standard/ library
> >>> functions within the compiler but instead leaving this implemented
> >>> external to the compiler proper (so that these functions can be
> >>> written by someone else and code generated at compile time). 
> >> 
> >> Correct. 
> >> 
> >> I thought this all is basically understood when talking about
> >> standard libraries (that is, libraries with an interface that is
> >> part of the language standard).
> > 
> > Ok so I'm starting to get on the same page as you (I may not agree
> > but at least I now understand how you are seeing things :)
> 
> I thought that there was some disconnect, and I hoped that we would get
> there eventually :)
> 
> > Furthermore you seem to be saying that using a function call syntax
> > rather than a verbose statement syntax (e.g. SQL) should be equally
> > easy for the compiler to understand
> > e.g.
> > 	A compiler should be able to interpret the following three 
> > 	statements as identical
> > 
> > (1)...	// function call syntax
> > 	if( lt(x, 0), assign(y, 0), assign(y, x) )
> > 
> > (2)...	// C syntax
> > 	if (x<0)
> > 	{	y = 0;
> > 	}
> > 	else
> > 	{	y = x;
> > 	}
> > 
> > (3)...	// verbose syntax
> > 	if x < 0 then
> > 		y = 0
> > 	else
> > 		y = x
> > 	endif
> 
> I suspect that in many compilers (3) and (2) end up (as an intermediate
> representation) in something that's equivalent to (1). But even if not,
> it shouldn't be too difficult to make all three end up in the same,
> whatever the compiler's internal representation of this statement is. 

Ok so I write a separate parser for each of the three and internally they 
all produce:

	if_statement
	.. expr
	.. .. lt
	.. .. .. x
	.. .. .. 0
	.. statement
	.. .. assign
	.. .. .. y
	.. .. .. 0
	.. statement
	.. .. assign
	.. .. .. y
	.. .. .. x

So what advantage does this give me?

If instead of the keyword 'if' I used a function 'xyz' thus:

	xyz( lt(x, 0), assign(y, 0), assign(y, x) )

my parsers would all now produce:

	expr
	.. func
	.. .. xyz
	.. .. expr
	.. .. .. lt
	.. .. .. .. x
	.. .. .. .. 0
	.. .. expr
	.. .. .. assign
	.. .. .. .. y
	.. .. .. .. 0
	.. .. expr
	.. .. .. assign
	.. .. .. .. y
	.. .. .. .. x

Internally the compiler has different sections that deal with generating 
code for 'if_statement' trees, 'statement' trees and 'expr' trees.
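To make the two trees concrete, here is a minimal sketch in Python. The tuple layout, the names IF_TREE, XYZ_TREE, dump and select_generator, and the three stub generators are all made up for illustration; they are not how any particular compiler does it. The sketch only shows that the choice of code generator is driven by the kind of the root node, so the 'xyz' tree lands in the ordinary 'expr' generator.

```python
# Hypothetical tree model: a node is (kind, [children]); a leaf is a string.
IF_TREE = ("if_statement", [
    ("expr", [("lt", ["x", "0"])]),
    ("statement", [("assign", ["y", "0"])]),
    ("statement", [("assign", ["y", "x"])]),
])

XYZ_TREE = ("expr", [
    ("func", [
        "xyz",
        ("expr", [("lt", ["x", "0"])]),
        ("expr", [("assign", ["y", "0"])]),
        ("expr", [("assign", ["y", "x"])]),
    ]),
])


def dump(node, depth=0):
    """Return the tree as lines in the dotted notation used above."""
    prefix = ".. " * depth
    if isinstance(node, str):
        return [prefix + node]
    kind, children = node
    lines = [prefix + kind]
    for child in children:
        lines.extend(dump(child, depth + 1))
    return lines


# Stub generators (assumed names); a real one would emit code for the tree.
def gen_if(tree):
    return "if_statement generator"


def gen_statement(tree):
    return "statement generator"


def gen_expr(tree):
    return "expr generator"


GENERATORS = {
    "if_statement": gen_if,
    "statement": gen_statement,
    "expr": gen_expr,
}


def select_generator(tree):
    """Dispatch purely on the kind of the root node."""
    kind, _children = tree
    return GENERATORS[kind](tree)


if __name__ == "__main__":
    print("\n".join(dump(XYZ_TREE)))
    print(select_generator(IF_TREE))
    print(select_generator(XYZ_TREE))
```

In this sketch nothing about 'xyz' reaches the 'if_statement' generator unless someone adds a rule that says so, which is the point of the question that follows.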

How would you propose that I treat 'xyz' differently while parsing and how 
would you add a specific code generator for the now special 'xyz' (you 
need to describe all this somehow)?

e.g.
the 'if_statement' code generator knows that either one or two statements 
may follow the condition expression. It also knows that if the condition 
expression evaluates to a compile time constant it can discard either the 
'true' or the 'false' statement. It also needs to interact with the code 
generator for the condition expression so that that generator can produce 
efficient jumps straight to the 'true' or 'false' statements (think of 
early out logical expressions involving '&&' and '||'). Consider the 
difference between "if (cond)..." and "X=cond".
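A rough sketch of that special knowledge, in Python. Everything here is an assumption made for illustration: the tuple form of conditions, the label names, the textual "goto" output and the function names gen_cond, gen_if and gen_value. It shows the three points above: the dead branch is dropped for a constant condition, '&&' and '||' jump out early, and "X=cond" has to turn the same jumps into a stored 0 or 1.

```python
import itertools


def _fresh_labels():
    """Yield L0, L1, L2, ... (hypothetical label names)."""
    return (f"L{n}" for n in itertools.count())


def gen_cond(cond, true_label, false_label, fresh, out):
    """Emit jumps for a condition; '&&' and '||' exit early."""
    op = cond[0]
    if op == "&&":
        mid = next(fresh)
        # Left side false: jump to false_label, right side is skipped.
        gen_cond(cond[1], mid, false_label, fresh, out)
        out.append(f"{mid}:")
        gen_cond(cond[2], true_label, false_label, fresh, out)
    elif op == "||":
        mid = next(fresh)
        # Left side true: jump to true_label, right side is skipped.
        gen_cond(cond[1], true_label, mid, fresh, out)
        out.append(f"{mid}:")
        gen_cond(cond[2], true_label, false_label, fresh, out)
    else:
        # A leaf test such as ("lt", "x", "0").
        out.append(f"if {op}({cond[1]}, {cond[2]}) goto {true_label}")
        out.append(f"goto {false_label}")


def gen_if(cond, then_code, else_code=None):
    """Generate code for 'if'; the else part is optional."""
    if cond[0] == "const":
        # Compile time constant: keep only the branch that can run.
        live = then_code if cond[1] else else_code
        return list(live or [])
    fresh = _fresh_labels()
    t, f, end = next(fresh), next(fresh), next(fresh)
    out = []
    gen_cond(cond, t, f, fresh, out)
    out.append(f"{t}:")
    out.extend(then_code)
    out.append(f"goto {end}")
    out.append(f"{f}:")
    out.extend(else_code or [])
    out.append(f"{end}:")
    return out


def gen_value(cond, target):
    """'X = cond': same jumps, but the result must be stored as 0 or 1."""
    fresh = _fresh_labels()
    t, f, end = next(fresh), next(fresh), next(fresh)
    out = []
    gen_cond(cond, t, f, fresh, out)
    out += [f"{t}:", f"{target} = 1", f"goto {end}",
            f"{f}:", f"{target} = 0", f"{end}:"]
    return out


if __name__ == "__main__":
    print("\n".join(gen_if(("lt", "x", "0"), ["y = 0"], ["y = x"])))
```

None of this can be applied to 'xyz' unless the compiler is told, somehow, that 'xyz' behaves like 'if'.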

> 
> It's probably a bit more difficult to write a parser for (2) and (3)
> than it is to write one for (1), but in the great scheme of a compiler,
> I don't think that this difference is crucial. 
> 
> So, yes, I think for a compiler these three could be identical. I don't
> see that a compiler could derive any information from any of them that
> it couldn't derive from the other two.

Actually it can. Consider a long complex program made up purely of 
functions as in (1). What happens with a misplaced comma or parenthesis? 
The verbose syntax lets the compiler catch silly mistakes. Consider the 
difference between:

	if (...)
	{
		a = b;
	}

	while (...)
	{
		c = d;
	}

and
	if ... then
	endif

	while ... do
	done

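Now edit the above, moving the 'while' inside the 'if'. With braces every 
closer looks the same, so a swapped pair cannot be detected; with distinct 
closing keywords the compiler can see the mismatch: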


	if (...)
	{
		while (...)
		{
			a = b;
		}

		c = d;
	}

and
	if ... then

		while ... do
			a = b;
		endif

		c = d;
	done
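The check that catches the swapped 'endif' and 'done' above can be sketched in a few lines of Python. The keyword table and the function name check_blocks are assumptions for illustration, and the sketch only looks at the first word of each line, which is enough for the example in this mail but not for a real parser.

```python
# Hypothetical keyword pairs, taken from the verbose syntax in this thread.
CLOSER_FOR = {"if": "endif", "while": "done"}
CLOSERS = set(CLOSER_FOR.values())


def check_blocks(source):
    """Return error messages for closing keywords that do not match."""
    errors = []
    open_blocks = []  # (opener, line number) for blocks not yet closed
    for lineno, line in enumerate(source.splitlines(), start=1):
        words = line.split()
        if not words:
            continue
        word = words[0]
        if word in CLOSER_FOR:
            open_blocks.append((word, lineno))
        elif word in CLOSERS:
            if not open_blocks:
                errors.append(f"line {lineno}: '{word}' closes nothing")
                continue
            opener, opened_at = open_blocks.pop()
            expected = CLOSER_FOR[opener]
            if word != expected:
                errors.append(
                    f"line {lineno}: found '{word}' but '{opener}' "
                    f"from line {opened_at} needs '{expected}'"
                )
    for opener, opened_at in open_blocks:
        errors.append(f"line {opened_at}: '{opener}' is never closed")
    return errors


# The edited verbose example from above, with the closers swapped.
EDITED = "\n".join([
    "if ... then",
    "",
    "    while ... do",
    "        a = b;",
    "    endif",
    "",
    "    c = d;",
    "done",
])

if __name__ == "__main__":
    for message in check_blocks(EDITED):
        print(message)
```

The equivalent check for the brace version can only count '{' against '}', so it has nothing to report when two closers change places.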


> 
> Provided, of course, that the functions used in (1) are just as defined
> as the operators and statements used in (2) and (3).

Ok so I'll give you that, all the functions are defined in exactly the 
same way in (1) as they are in (2) and (3). But what are we going to do 
about the vast number of functions defined in the /standard/ library?

> 
> > AND because the compiler should be able to "understand" functions as
> > easily as other language statements that it is a convenient way to
> > extend the language. 
> 
> Yes, extend or customize. That's approximately the C++ standard library
> way. 

But the C++ way is horrible! You have CLASS upon CLASS upon CLASS. If you 
want to write a modest program you end up so deep in 'standard' classes 
and templates that it gets very hard to see the wood for the trees.

Then there is this nonsense that user classes should be written in such a 
way as to have special methods that the standard libraries expect (things 
like iterators) so that the items in a container can be accessed. The 
programmer shouldn't need to know about all this. He should be able to 
just say (e.g.)

	for all items in list FRED do

		*.x = $.x + 1
	done

> > Could you please confirm this. 
> 
> Basically confirmed. 

Friendly Regards
Sergio Masci