On Thu, Jan 13, 2011 at 08:36:15AM -0500, smplx wrote:
>
> NOTE: PIC tag added

Thanks. I lost the original subject and forgot to re-add it on the retype.

I'm going to do some snippage. Refer to the original post for details.

>
> On Wed, 12 Jan 2011, Byron Jeff wrote:
> > My interest is how to implement these threads. Forth has several
> > different types of implementations. Without going into all the nuances,
> > there are three basic ways of implementing how to get to a particular
> > subroutine:
> >
> > 1. An actual subroutine call.
> > 2. Specifying the address of the subroutine and doing an indirect
> >    jump/call.
> > 3. Specifying a token for the subroutine.
> >
> > Each has its pros and cons in terms of space and speed. The design
> > decisions are further complicated on a PIC because of paging issues and
> > the fact that in general PICs can only execute code from program memory.
> >
> > So I'm trying to decide the best way to implement these threads. My
> > design considerations are:
> >
> > 1) Threads should have the ability to execute from program memory or RAM.
> >
> > 2) Threads should be able to be located in any part of program memory.
> >
> > 3) Code space and execution efficiency should be optimized.
> >
> > Clearly a balancing act.
> >
> > Specifying just the address has advantages and disadvantages. It does
> > have the advantage of "running" both from program memory and RAM
> > unchanged. With the new CALLW instruction the entire program memory can
> > be reached. And finally, with the new indirect access map, there is a
> > unified RAM and program memory address space.
> >
> > The challenges are primarily space and time considerations. As a
> > compromise, the indirect access map only accesses the lower byte of each
> > program memory word. So that means two fetches would be required to get
> > each address. On the other hand, the traditional EEDATA fetch can get
> > all 14 bits of each word. But two challenges are that 14 bits only gives
> > 16K words of access (which is the entire 16F1939 program space) and that
> > it takes quite a bit of setup to use EEDATA to fetch both words.
> >
> > Now tokens have some possibilities. Tokens can be shoved into 8 bits,
> > which requires only a single fetch. The new BRW instruction makes it
> > trivial to implement a jump table. The biggest problem with tokens is
> > that there's a limit to the number you can implement before you have to
> > do something about it.
> >
> > I'm thinking that maybe a mix of tokens and addresses may be the winner.
> > If tokens have bit 7 clear and addresses have bit 7 set, then tokens can
> > be used for heavily accessed subroutines while addresses can be used for
> > others.
> >
> > Anyway, just some thoughts. If you have any, I'd like to hear them.
>
> Ok, just a couple of thoughts.
>
> Yes, using a mixture of tokens and absolute addresses sounds good. However
> I would forget about using a bit in the token to distinguish between the
> two as this means you lose half your tokens and half your address space
> AND you increase the runtime overhead by having to decide which you are
> decoding every time.

Somewhat good points. Actually, half the address space isn't lost, because
the architecture only has a 15-bit address space. So 7 bits from the token
+ 8 bits from the next fetch = 15 bits.

Decoding is a valid issue.
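To make the 7 + 8 = 15 bit reconstruction concrete, here is a rough sketch
(mine, not part of the original design) of what the address path could look
like on the enhanced mid-range core. It assumes the fetchtoken helper used
later in this message returns the next thread byte in WREG, and that
addr_hi/addr_lo are scratch locations in common RAM; the label names are
placeholders.

absolute
        bcf     WREG,7          ; clear the flag bit, leaving address bits 14:8
        movwf   addr_hi         ; stash the high byte (assumed common-RAM scratch)
        call    fetchtoken      ; next thread byte = low 8 bits of the target
        movwf   addr_lo         ; stash the low byte
        movf    addr_hi,W
        movwf   PCLATH          ; CALLW takes PC<14:8> from PCLATH
        movf    addr_lo,W       ; and PC<7:0> from WREG
        callw                   ; call the target; its RETURN comes back here
        movlp   HIGH(next)      ; PCLATH was clobbered, so reselect our page
        goto    next            ; resume the inner interpreter (assumed label)

The shuffle through addr_hi/addr_lo is only there because CALL and GOTO are
paged through PCLATH, so the target's high byte can't be dropped into PCLATH
until after the second fetch.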
My thought process was that, to get full access to the memory space, the
jump table would need 2-instruction entries (a MOVLP to set PCLATH followed
by a jump/call), and BRW can only reach 256 words forward of the
instruction, so 256/2 = 128 possible tokens. For 256 tokens I would have to
decode the top bit anyway to determine which jump table to access. So I
figured that decoding, and using the absolute address for the second half,
would be a win.

> Instead I would propose that you always use tokens and that you reserve
> one of these to handle an absolute address. In effect all the subroutines
> referenced in your subroutine look-up table get called immediately,
> without worrying if the token is a token or an address, and you have one
> special subroutine that takes the next two bytes of the stack (its 16-bit
> integer parameter) and uses these as the actual address of the real
> subroutine to call (jump to). The return address doesn't need to be
> messed with, and banking is easier because you are only calling
> subroutines via your look-up table and the special indirect subroutine
> handler can do the necessary banking for you. So although it does require
> a little extra work, because it looks like you are doing kind of two
> calls, it should on average actually be more efficient than processing
> each token specially to see if it is a subroutine token or an absolute
> address.

See above. The only problem I see with this approach is the size mix of
tokens + absolute addresses. With my approach you have 1 byte for tokens
and 2 bytes for absolute addresses. With yours there is 1 byte for tokens
and 3 bytes for absolute addresses. And as alluded to above, the top bit of
the token still needs to be decoded. Finally, extending the token set does
not seem to make much sense either, since extended tokens would take 2
bytes, and in 2 bytes you can already reach any absolute address.

The decoding should only take a single btfsx instruction, as the extended
16F architecture maps WREG as an ordinary file register. So the sequence
should be something like:

        call    fetchtoken
        btfsc   WREG,7          ; Skip if token
        goto    absolute        ; process absolute address
        lslf    WREG,F          ; double token for table jump
        brw                     ; to table branch
tokentable
        movlp   HIGH(token0)    ; top part of first token
        goto    token0          ; process first token
        ....

I think with your approach either the same code, with a jump to the second
half of the token table instead of the absolute jump, or a clear
carry/rlf/test on carry would need to be done. I don't think there is more
than one instruction time's worth of difference one way or the other. It
is certainly an interesting analysis.

BTW, I have a design goal of running threads out of RAM so that temporary
words can be written and tested in RAM before being committed to flash.

BAJ

>
> Regards
> Sergio Masci

--
Byron A. Jeff
Department Chair: IT/CS/CNET
College of Information and Mathematical Sciences
Clayton State University
http://cims.clayton.edu/bjeff