Mike Harrison says:

While we're on the subject of neat little routines etc...

This code sends a byte at 115.2 kbaud.
With a 4.096 MHz xtal, error is about 1.25% (near enough),
with 4.000 MHz, about 3.7% (marginal).
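
For anyone wanting to check the timing for a different crystal, here is a rough sketch of the arithmetic in Python. It rests on two assumptions that are not stated in the post: the core takes 4 oscillator clocks per instruction cycle (PIC-style), and the bit loop costs 9 instruction cycles per pass (single-cycle instructions, two cycles for the jump, skip timing balancing out either way). The names are made up for illustration.

```python
# Assumptions (not from the post): 4 oscillator clocks per instruction
# cycle and a 9-cycle bit loop. Change these if your core differs.
CLOCKS_PER_CYCLE = 4
CYCLES_PER_BIT = 9
BAUD = 115200


def bit_time_error(f_osc_hz):
    """Fractional amount by which each transmitted bit is too long."""
    actual_bit_s = CYCLES_PER_BIT * CLOCKS_PER_CYCLE / f_osc_hz
    ideal_bit_s = 1 / BAUD
    return actual_bit_s / ideal_bit_s - 1


for f_osc in (4_096_000, 4_000_000):
    print(f"{f_osc / 1e6:.3f} MHz: {bit_time_error(f_osc) * 100:+.2f}%")
```

Under those assumptions this gives +1.25% at 4.096 MHz and +3.68% at 4.000 MHz. On a core with different instruction timing the loop would need recounting.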

Data is non-inverted (i.e. it can be fed straight to a PC).

txbyte
	mov	temp, W	; byte to send arrives in W
	mov	W, #10	; 1 start + 8 data + 1 stop
	mov	cnt, W
	clrb	C	; start bit
	mov	W, Rb	; read the port so the other pins are preserved
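txloop
	and	W, #0ff-(1<<seroutbit)	; clear the serial output bit
	sb	C	; skip the OR if carry (the bit to send) is set
	or	W, #1<<seroutbit	; otherwise set the serial output bit
	mov	Rb, W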
	setb	C
	rr	temp	; carry shifted in as stop bit
	decsz	cnt
	jmp	txloop
	ret
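
To see what the loop puts on the pin, here is a small Python model of the routine. It is only a sketch: it assumes `sb C` skips the following instruction when carry is set and that `rr` rotates right through carry, and the function name and default arguments are invented for the example.

```python
def txbyte_trace(value, seroutbit=0, port=0x00):
    """Model the txbyte loop; return (carry, pin_level) for each of 10 bits."""
    temp = value & 0xFF
    carry = 0                              # clrb C: start bit
    w = port & 0xFF                        # mov W, Rb
    trace = []
    for _ in range(10):                    # cnt = 10
        w &= 0xFF - (1 << seroutbit)       # and W, #0ff-(1<<seroutbit)
        if not carry:                      # sb C: OR is skipped when carry is set
            w |= 1 << seroutbit            # or W, #1<<seroutbit
        trace.append((carry, (w >> seroutbit) & 1))   # mov Rb, W
        # Carry is forced to 1, then rr temp: the 1 enters bit 7 and
        # the old bit 0 drops out into carry for the next pass.
        carry, temp = temp & 1, (temp >> 1) | 0x80
    return trace


bits = [c for c, _ in txbyte_trace(0x41)]
levels = [p for _, p in txbyte_trace(0x41)]
print(bits)    # start bit, data LSB first, stop bit
print(levels)  # what the pin is driven to for each of them
```

For 0x41 the carry sequence is 0, then 1,0,0,0,0,0,1,0, then 1, i.e. start, data LSB first, stop. In this model the pin is driven high for a 0 and low for a 1, so the line rests low after the stop bit, which appears consistent with the remark about feeding a PC directly.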

Incidentally, it's also possible to do 115K2 bit-bashed reception at 4.096 MHz by generating an INT interrupt off the start-bit edge and grabbing all the data bits within the interrupt code. Context save/restore overheads mean there are only a couple of cycles available to the foreground task if data is streaming in continuously!
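
To put a rough number on how tight that is, the sketch below works out how many instruction cycles fit in one back-to-back frame. It assumes 4 oscillator clocks per instruction cycle and a 10-bit frame (start, 8 data, stop); the function name is hypothetical and the cost of the interrupt code itself is not modelled.

```python
def cycles_per_frame(f_osc_hz, baud=115200, bits=10, clocks_per_cycle=4):
    """Instruction cycles available during one serial frame."""
    cycles_per_second = f_osc_hz / clocks_per_cycle
    frame_seconds = bits / baud
    return cycles_per_second * frame_seconds


print(f"{cycles_per_frame(4_096_000):.1f} cycles per byte")
```

At 4.096 MHz that is just under 89 cycles per byte, and the interrupt entry, the bit sampling, storing the result, and the context restore all have to come out of it.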