Print: cyrillic #34256 12 May 21 09:41 AM
Jorge Tavares - UmZero (OP), Member, Joined: Jun 2001, Posts: 3,376
Good morning Jack,

Is Cyrillic text copied into an XTREE supposed to print correctly using GDI (textout)?


Jorge Tavares

UmZero - SoftwareHouse
Brasil/Portugal
Re: Print: cyrillic [Re: Jorge Tavares - UmZero] #34257 12 May 21 03:20 PM
Jack McGregor, Member, Joined: Jun 2001, Posts: 11,645
Hi Jorge,

I guess it is supposed to print correctly, but as is often the case, there may be a gap between what we suppose and what actually is.

Although presumably you have your own XTREE with UTF8 characters, for testing purposes we can always use XTUTF8.BP and the accompanying XTUTF8.CSV data file, which contains a nice variety of character sets, including Cyrillic, Japanese, and Greek.

The first problem I notice is that the integrated export to XLSX function doesn't handle them, but that's an issue with CSV2XL, not XTREE or GDI printing. (Let me review that and get back to you.)

But you mention //TEXTOUT, which suggests that you are somehow converting your XTREE data into //TEXTOUT statements? GDI printing in general, and //TEXTOUT specifically, does support Unicode, but you have to encode any non-ANSI characters using the &#Xhhhh notation (as described in Printing Special Characters). That would obviously be a pain in the neck to do manually, but fortunately there is an SOSLIB function Fn'UTF8toGDI$(utf8string$) that will do the dirty work for you.
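To make the encoding step concrete, here is a minimal sketch in Python (not A-Shell BASIC) of the kind of transformation described above. It is not the SOSLIB Fn'UTF8toGDI$ source. The function name `utf8_to_entities` is hypothetical, and two details are assumptions: that every character above 7-bit ASCII gets encoded, and that each reference ends with an HTML-style `;`.

```python
def utf8_to_entities(data: bytes, terminator: str = ";") -> str:
    """Replace each non-ASCII character with an &#Xhhhh reference.

    Assumptions (not confirmed by the thread): anything above 0x7F is
    encoded, and each reference ends with `terminator`.
    """
    text = data.decode("utf-8")  # raises UnicodeDecodeError if not valid UTF-8
    out = []
    for ch in text:
        cp = ord(ch)
        if cp < 0x80:
            out.append(ch)  # plain ASCII passes through unchanged
        else:
            # at least 4 hex digits, more for code points above U+FFFF
            out.append(f"&#X{cp:04X}{terminator}")
    return "".join(out)


encoded = utf8_to_entities("Жук 1".encode("utf-8"))
print(encoded)
```

Applying a function like this one column at a time lets pure-ANSI columns skip the conversion entirely.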

More generally we might ask why we can't just switch an entire text file over from ANSI to UTF8. The main reason, I think, is that we have all kinds of text files and no standard framework (like HTML or XML headers) to identify to the application or A-Shell what the encodings are. We could possibly try to support the BOM (Byte Order Mark) technique, but again, because there have never been any restrictions or assumptions about the contents of a text file, trying to interpret the file contents based on the first 2 characters runs a risk of mis-identification. And that's before we even start to think about the complexities of inserting the UTF8 (or Unicode) translation logic into every routine that processes text characters, from GET.SBR to INPUT and everything in between. As a practical matter, we probably need to carve out specific areas where the encodings can be unambiguously identified and constrained within reasonable limits (as was done with XTREE). I'm open to suggestions...
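The mis-identification risk can be illustrated with a small Python sketch of BOM sniffing. The helper name `sniff_bom` is hypothetical, and the sketch says nothing about how A-Shell would actually implement it. The UTF-8 BOM is the three bytes EF BB BF, while the UTF-16 BOMs are two bytes each, and a legacy ANSI (Windows-1252) file that happens to begin with "ÿþ" carries exactly the UTF-16 little-endian BOM bytes.

```python
import codecs

# Known BOM byte sequences, checked in order.
_BOMS = (
    ("utf-8", codecs.BOM_UTF8),         # EF BB BF
    ("utf-16-le", codecs.BOM_UTF16_LE), # FF FE
    ("utf-16-be", codecs.BOM_UTF16_BE), # FE FF
)


def sniff_bom(data: bytes):
    """Return the encoding implied by a leading BOM, or None if there is none."""
    for name, bom in _BOMS:
        if data.startswith(bom):
            return name
    return None


genuine = codecs.BOM_UTF8 + "Жук".encode("utf-8")
legacy_ansi = "ÿþ price list".encode("cp1252")  # ANSI text, not UTF-16

print(sniff_bom(genuine))      # a real UTF-8 BOM
print(sniff_bom(legacy_ansi))  # false positive: reported as UTF-16
print(sniff_bom(b"plain text"))
```

The second call shows why a BOM alone is not a safe signal when files have never been constrained in what their first bytes may contain.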

Re: Print: cyrillic [Re: Jorge Tavares - UmZero] #34258 12 May 21 03:27 PM
Jorge Tavares - UmZero (OP)
Hi Jack,

No need for suggestions, I think Fn'UTF8toGDI$(utf8string$) will do the trick.
Even if the XTREE as a whole is UTF8, it won't be rare to find cases where some columns are and others aren't, so handling the print column by column and sending the data through that function will be just perfect.
I'll let you know if I find any difficulty, and thank you for the tip.


Re: Print: cyrillic [Re: Jorge Tavares - UmZero] #34264 12 May 21 06:11 PM
Jorge Tavares - UmZero (OP)
Jack,
Just to confirm that this worked like a charm.
I don't know if you are aware that //TEXTCENTER doesn't work, but //TEXTRECTANGLE does the job, so at least in my case that's not an issue.

Thanks a lot.


Re: Print: cyrillic [Re: Jorge Tavares - UmZero] #34266 13 May 21 12:34 AM
Jack McGregor
Doh! Thanks for the heads-up on //TEXTCENTER! It was calculating the positioning before converting the entity references, resulting in coordinates that were completely off the map. It's fixed in 6.5.1702.7 (which I'll post as soon as I resolve the maxrecs issue.)
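For readers wondering why the order of operations matters, here is a rough Python sketch of the effect. It is not the A-Shell code. It uses character counts as a stand-in for real GDI text widths, assumes references of the form `&#Xhhhh;` with a trailing semicolon, and the names `decode_entities` and `center_start` are hypothetical.

```python
import re

# Assumed reference format: &#X followed by 1-6 hex digits and a semicolon.
ENTITY = re.compile(r"&#[xX]([0-9A-Fa-f]{1,6});")


def decode_entities(s: str) -> str:
    """Turn each &#Xhhhh; reference back into the single character it names."""
    return ENTITY.sub(lambda m: chr(int(m.group(1), 16)), s)


def center_start(text: str, line_width: int, decode_first: bool) -> int:
    """Starting column that centers text on a line of line_width characters."""
    visible = decode_entities(text) if decode_first else text
    return (line_width - len(visible)) // 2


encoded = "&#X0416;&#X0443;&#X043A;"  # three Cyrillic letters, 24 raw characters

print(center_start(encoded, 20, decode_first=False))  # measured before decoding
print(center_start(encoded, 20, decode_first=True))   # measured after decoding
```

Measuring the raw string treats three printed characters as 24, which pushes the starting position left of the line itself; decoding first gives the position that matches what is actually printed.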

Re: Print: cyrillic [Re: Jorge Tavares - UmZero] #34302 18 May 21 08:53 PM
Jack McGregor
The //TEXTCENTER problem with &#Xhhhh entity references (for UTF8 or Unicode characters) should be resolved here...

ash-6.5.1703.0-w32-upd.zip
ash-6.5.1703.0-w32c-upd.zip
ash65notes.txt

Re: Print: cyrillic [Re: Jorge Tavares - UmZero] #34306 20 May 21 01:26 PM
Jorge Tavares - UmZero (OP)
Confirmed the fix, thank you very much.


