Sunday, March 9, 2008

A Unicode chart generator

Written by
Xtreme Great
(for k0r0pt)

For those of you who want a complete Unicode chart: why not just generate your own? Here is a simple C program that generates an HTML file showing every code point from 0 to 65535. That range is the Basic Multilingual Plane, which holds most characters in everyday use. (Unicode itself goes on up to 1114111, that is U+10FFFF.)

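#include <stdio.h>

int main(void) {
    unsigned long i;

    /* Close <title> and <head>; 7 is the largest valid <font> size. */
    printf("<html><head><title>The complete character chart</title></head>"
           "<body><font size=7>");
    /* One numeric character reference per code point; note the ';'. */
    for (i = 0; i <= 65535; i++)
        printf("%lu: &#%lu; <br>", i, i);  /* %lu matches unsigned long */
    printf("</font></body></html>");
    return 0;
}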

Save this as a file, compile it, and run it from the command line. When you run it, redirect the output to a file, say ucc.htm (for example ucc > ucc.htm, if your compiled program is called ucc). This will generate an HTML file of a little over 1 MB. If you don't know how to redirect the output to a file, just leave a comment. This HTML file will contain every code point the loop covers. If it shows question marks or boxes, just change the encoding to Unicode (UTF-8), as directed in the previous post. Boxes that remain after that usually mean your fonts have no glyph for that character.
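If you would rather not redirect anything, the program can open the file itself. Here is a minimal sketch of that idea, not a definitive version: the names is_surrogate, write_chart and write_chart_file are made up for this example. It also skips the surrogate range (55296 to 57343), which is reserved and never holds characters.

```c
#include <stdio.h>

/* Surrogate code points (U+D800 to U+DFFF) are reserved, not characters. */
int is_surrogate(unsigned long cp) {
    return cp >= 0xD800 && cp <= 0xDFFF;
}

/* Writes one entry per code point in first..last to an open stream.
   Returns the number of entries written. */
long write_chart(FILE *out, unsigned long first, unsigned long last) {
    unsigned long cp;
    long count = 0;

    fprintf(out, "<html><head><title>Character chart</title></head><body>");
    for (cp = first; cp <= last; cp++) {
        if (is_surrogate(cp))
            continue;               /* nothing to show for these */
        fprintf(out, "%lu: &#%lu; <br>", cp, cp);
        count++;
    }
    fprintf(out, "</body></html>");
    return count;
}

/* Same as write_chart, but creates (or overwrites) the file at path.
   Returns the number of entries written, or -1 on failure. */
long write_chart_file(const char *path, unsigned long first, unsigned long last) {
    FILE *out = fopen(path, "w");
    long count;

    if (out == NULL)
        return -1;
    count = write_chart(out, first, last);
    if (fclose(out) != 0)
        return -1;
    return count;
}
```

Calling write_chart_file("ucc.htm", 0, 65535) from main would give roughly the same chart as above without any redirection. Passing 0x10FFFF as the last value should cover all of Unicode, at the cost of a much bigger file.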

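The chart does behave oddly after 8238, though: the numbers are written right to left instead of left to right. So after 8238 comes 9328, which is 8239 read in the other direction. I first suspected a CR or LF, but that is not the cause. Code point 8238 is U+202E, RIGHT-TO-LEFT OVERRIDE, a control character that forces the text following it to be displayed right to left. In many other entries the character is written first and the number after it. That is most likely the same bidirectional text handling at work, since letters from right-to-left scripts such as Hebrew and Arabic pull the numbers and punctuation around them into their own direction.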
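One way to keep the chart readable is to stop those characters from affecting their neighbours. The sketch below is just one possible approach, and is_bidi_control and print_entry are names invented for this example. It prints a label in place of the bidirectional control characters and wraps every other character in the HTML bdi element, which should isolate it from the number beside it.

```c
#include <stdio.h>

/* Returns 1 if cp is one of Unicode's bidirectional control characters. */
int is_bidi_control(unsigned long cp) {
    return cp == 0x061C                      /* ARABIC LETTER MARK */
        || cp == 0x200E || cp == 0x200F      /* LRM, RLM */
        || (cp >= 0x202A && cp <= 0x202E)    /* LRE, RLE, PDF, LRO, RLO */
        || (cp >= 0x2066 && cp <= 0x2069);   /* LRI, RLI, FSI, PDI */
}

/* Writes one chart entry to out; returns the number of characters written. */
int print_entry(FILE *out, unsigned long cp) {
    if (is_bidi_control(cp)) {
        /* Show a label instead of the live control character. */
        return fprintf(out, "%lu: (bidi control) <br>", cp);
    }
    /* <bdi> isolates the character so it cannot reorder the number. */
    return fprintf(out, "%lu: <bdi>&#%lu;</bdi> <br>", cp, cp);
}
```

To try it, replace the printf inside the loop of the chart program with print_entry(stdout, i).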

Before I finish, one more piece of advice: don't freak out if your browser stops responding. Just wait a while and it will recover. The file is simply far bigger than what a browser normally has to handle. Happy unicoding...
