Why not define one (very rarely used) character code as a character set 'page swap' prefix (let's call it CSP)? Then, under the old rules, you can use the existing symbol set without touching anything...

For the new boys that aren't moving to the latest SubTitling technologies, then you send CSP and as many bytes as defined/required to designate the page you want to swap to. And then display away...
Return to your original CSP page when done.

Then you declare that page CSP + &h80 + &h00 is the standard Roman CS, while CSP + &h80 + &h01 is something else, etc...

This does require heavy overhead if every second char is from a different char set, but even then, there are ways to optimize the layout of the sets for maximum hit efficiency.

(I threw in the high bit, &h8x, so that if the CSP appears in 'normal' text messages, there is only a very small chance of the succeeding byte having the high bit set, so a stray CSP can be ignored automatically...)

One byte should be enough, but perhaps you want to put some other things in there as well...
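
A rough sketch of how a decoder for the scheme above might look, in Python. Everything specific here is an assumption for illustration only: the CSP value (&h1F), reading "CSP + &h80 + page" as a single following byte with the high bit set and the page number in the low 7 bits, and the toy Greek mapping used for page 1. A CSP that is not followed by a high-bit byte is simply dropped.

```python
CSP = 0x1F        # hypothetical: any rarely used code would do
PAGE_FLAG = 0x80  # high bit marks a valid page-select byte

# Hypothetical page tables: page 0 is plain Roman, page 1 is a toy
# mapping of 'A', 'B', ... onto Greek capitals, purely for the demo.
PAGES = {
    0: lambda b: chr(b),
    1: lambda b: chr(0x0391 + (b - 0x41)),
}

def decode(data, pages=PAGES, default_page=0):
    """Decode a byte stream containing CSP page-swap prefixes."""
    page = default_page
    out = []
    i = 0
    while i < len(data):
        b = data[i]
        if b == CSP:
            # Only honour the CSP if the next byte has the high bit set.
            if i + 1 < len(data) and data[i + 1] & PAGE_FLAG:
                page = data[i + 1] & 0x7F
                i += 2
            else:
                i += 1  # stray CSP in 'normal' text: ignore it
            continue
        # Unknown pages fall back to the default page in this sketch.
        mapper = pages.get(page, pages[default_page])
        out.append(mapper(b))
        i += 1
    return "".join(out)

if __name__ == "__main__":
    stream = bytes([0x48, 0x69, CSP, 0x81, 0x41, 0x42, CSP, 0x80, 0x21])
    print(decode(stream))
```

The demo stream is "Hi" on the default page, a swap to page 1 for two glyphs, then a swap back to page 0 for "!". The cost is two extra bytes per swap, which is where the overhead mentioned above comes from.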

Regards,
MC
Looking for good broadcast engineering projects...

Michael Coop
Fax:    +60 3 411-8260
email   mcoop@pop.jaring.my


-----Original Message-----
From:   John Payson [SMTP:supercat@MCS.COM]
Sent:   Tuesday, July 08, 1997 6:49 AM
To:     PICLIST@MITVMA.MIT.EDU
Subject:        Re: Culture (was For sale)

> Those of you working with closed captioning may have noticed the xenophobic
> character set the FCC selected at the bidding of the National Captioning
> Institute and WGBH.  Some accented characters, but not enough to do French.
>  Some dingbats but not enough to do Spanish or Portuguese.  Odd.

Well, the entire closed-captioning character set can be displayed(*) using
only a 7-bit display buffer.  If there were any more characters, this
would no longer be possible (**).  Personally, I think they should have
included 128 /printable/ glyphs in the set at minimum, but I didn't design
it.  As it is, the codes divvy up as:

110 : Printable glyphs(*)
7   : Color select prefix codes
7   : Underlined color select prefix codes
2   : Italics and italics underlined prefix codes
1   : Flashing prefix code
1   : Transparent space

(*) There are actually 111 defined printable glyphs (which would up the
total to 129 different codes).  The Philips CC chip solves this problem by
mapping "O" to "0".

(**) Well, you could get one more if you mapped "l" to "1".
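
As a quick check of the arithmetic in the table and footnote (*) above, here is a small Python tally. The dictionary keys are just labels made up for this sketch; the counts are the ones listed.

```python
# Counts taken from the table above; key names are illustrative only.
code_counts = {
    "printable_glyphs": 110,
    "color_select": 7,
    "underlined_color_select": 7,
    "italics_prefixes": 2,
    "flashing_prefix": 1,
    "transparent_space": 1,
}

total = sum(code_counts.values())
buffer_capacity = 2 ** 7  # distinct values in a 7-bit display buffer

# With all 111 defined glyphs kept distinct, the total goes one over.
total_with_all_glyphs = total - code_counts["printable_glyphs"] + 111

if __name__ == "__main__":
    print(total, buffer_capacity, total_with_all_glyphs)
```

The listed counts fill the 7-bit buffer exactly, and the 111th glyph is the one code too many that has to be folded onto another.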


The things I thought were odd when working with closed captioning--much
stranger than its "xenophobia", were:

[1] No asterisk character ($2A is an "a" with an aigu)

[2] Doubling up of all the color-select codes (and all the XY prefix
commands in the command set) to support underlining, when I have only ONCE
ever seen anything deliberately underlined in a closed-caption broadcast.

[3] Bizarre handling of "gotoXY" codes.

[4] Semi-mandatory doubling of the 16 double-byte printable glyphs (which
means that each of them takes up four bytes).  In the code I wrote for
captioning, I'd only double the d-byte glyphs if they were followed by
matching codes (so two musical note characters together would appear as 6
bytes).
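
The doubling rule in [4] can be sketched roughly as follows in Python. This is not the original captioning code, only an illustration of the rule as described: a double-byte glyph is sent twice only when the next glyph is the same code. The byte values for NOTE are a placeholder assumption and should be checked against the captioning standard before use.

```python
NOTE = bytes([0x11, 0x37])  # placeholder value for a double-byte glyph

def encode(glyphs):
    """Concatenate glyph codes, doubling a two-byte glyph only when
    the glyph that follows it is the identical code."""
    out = bytearray()
    for i, g in enumerate(glyphs):
        out += g
        if len(g) == 2:
            nxt = glyphs[i + 1] if i + 1 < len(glyphs) else None
            if nxt == g:
                out += g  # send it twice so the repeat is not swallowed
    return bytes(out)

if __name__ == "__main__":
    print(len(encode([NOTE, NOTE])))   # two notes together
    print(len(encode([NOTE, b"A"])))   # a lone note, then a letter
```

Two notes in a row come out as 4 + 2 = 6 bytes, against 8 if every double-byte glyph were doubled unconditionally.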

[5] An end-credit credit from NCI (National Captioning Institute) which
contained 5 lines of text--in direct violation of the captioning standards
(and the capabilities of some sets) which only allow for four!

Still, despite its weirdness, captioning is cool.  Too bad I never got to
unleash my caption multiplexor on an unsuspecting world--that thing was
neat.