Moheb's Coptic Pages

Converting CS Text files into Unicode

Quick links:

Converter I: very simple, only for text files, also swaps combining marks

Converter II: text files and clipboard, map is editable with Excel

SIL Converter: professional although free, also for MS Word


Converter I

I have written a small utility that converts older Coptic text files into Unicode text files. Before Unicode was established, there was no ASCII or ISO 8859-X standard for Coptic key mapping; every Coptic font defined its own encoding. A few years ago, two active members of the Remenkimi Group (Coptic Community) started defining what they called the "Coptic Fonts Standard". They remapped all the well-known Coptic fonts of that time to a unified keyboard mapping. Their standard, along with 7 fonts converted to the CS standard, can be downloaded at: The Coptic Orthodoxe Network.

The converter I have written converts text files written in this CS Standard into Coptic Unicode. Currently UTF-8 and UTF-16 are supported. Simply start the tool, select the file you want to convert, and give the name of a new file for writing the Unicode output. After that you can open the generated Unicode file with any editor that supports Unicode. Don't forget to select a Coptic Unicode font.
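To make the idea concrete, here is a minimal Python sketch of such a file conversion. It is not the code of the tool itself: the function name `convert_file` and the `table` argument are my own assumptions for illustration, and the real CS mapping has to be taken from the conversion table. Reordering of combining marks is left out of this sketch.

```python
from pathlib import Path


def convert_file(src, dst, table, encoding="utf-8"):
    """Read a byte-encoded text file, map each character, write Unicode.

    `table` is a dict from legacy characters to Unicode strings
    (hypothetical; fill it from the real conversion table).
    """
    if encoding not in ("utf-8", "utf-16"):
        raise ValueError("only utf-8 and utf-16 are handled in this sketch")
    # latin-1 maps every byte 0x00-0xFF to exactly one character,
    # so no byte of the legacy file is lost or rejected.
    legacy = Path(src).read_bytes().decode("latin-1")
    # Characters without an entry in the table are copied unchanged.
    converted = "".join(table.get(ch, ch) for ch in legacy)
    # Python's "utf-16" codec writes a byte order mark at the start.
    Path(dst).write_bytes(converted.encode(encoding))
```

A call would look like `convert_file("input.txt", "output.txt", table, encoding="utf-16")`, where both file names are placeholders.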

Notice: Entering the "Jinkim" (or another combining diacritical mark, such as the overline) works differently in CS than in Unicode. In CS, the Jinkim is typed BEFORE the letter. In Unicode, the combining mark has to be entered AFTER the letter. The converter takes care of this problem.

Another important note: Jinkim and other combining diacritical marks will be mapped to the appropriate code points in the chart 0300-036F (Combining Diacritical Marks) as defined by the Unicode Consortium.
For example, the Jinkim (which is ASCII 0x60 in CS) will be converted to U+0300. The complete conversion table is available at this link.
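The reordering of combining marks can be sketched in a few lines of Python. This is only an illustration, not the converter's actual code. It assumes that a CS text stores the mark immediately before its base letter, and that the Jinkim (0x60) becomes U+0300, the combining grave accent. The letter entries in `CS_TO_UNICODE` are illustrative assumptions, not taken from the real conversion table.

```python
import unicodedata

# Hypothetical excerpt of a CS-to-Unicode table.
CS_TO_UNICODE = {
    "`": "\u0300",  # Jinkim (0x60) -> COMBINING GRAVE ACCENT
    "a": "\u2c81",  # assumed: COPTIC SMALL LETTER ALFA
    "e": "\u2c89",  # assumed: COPTIC SMALL LETTER EIE
    "m": "\u2c99",  # assumed: COPTIC SMALL LETTER MI
}


def cs_to_unicode(text):
    """Map each character and move combining marks behind their letter."""
    out = []
    pending = []  # converted marks still waiting for their base letter
    for ch in text:
        mapped = CS_TO_UNICODE.get(ch, ch)
        if unicodedata.combining(mapped):
            # A combining mark: hold it back until the next letter arrives.
            pending.append(mapped)
        else:
            # A base character: emit it first, then the marks typed before it.
            out.append(mapped)
            out.extend(pending)
            pending.clear()
    out.extend(pending)  # marks at the very end have no letter to follow
    return "".join(out)
```

With this table, `cs_to_unicode("a`m")` yields alfa, then mi, then the combining grave accent, so the Jinkim ends up after the letter it belongs to, as Unicode requires.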


Download CS Converter for Windows

Download the conversion Table as pdf-file
Download test files: CS input test file and the corresponding UTF-8 output file
CS Converter



Converter II


There is another converter available at the Danacbe Forum of the Coptic language.
It supports the following conversions:
  • CS Font to Unicode
  • Unicode to CS Font
  • Beshoy (Pishoi) to Unicode
  • Unicode to Beshoy
Danacbe Converter

You can get it by following this link.
Unfortunately, this converter mainly covers the Coptic upper-case and lower-case letters, but the mapping table is editable with Excel. It also doesn't swap the combining diacritical marks.


SIL Converter (MS Word)



SIL International has published a collection of professional tools for converting between different byte-based encodings and Unicode. Among these tools is a VB macro that is installed in MS Word. The complete package is called SIL-Converters and can be downloaded at: http://scripts.sil.org. After installing it, you will also need a conversion map that I have created, which defines the conversion rules between the CS encoding and Unicode. Just download this file: cs2unicode.tec and remember where you saved it.

Start Microsoft Word and open a file that is encoded in one of the CS fonts; you can use this sample file.
From the menu "Extras", choose the macro "Data conversion" as shown below: SILConverter step1
A new dialog will pop up. Choose "Select..." and another pop-up window will open:
SILConverter step2


Click the button "Add new". From the list that pops up, select the last type, "TECkit map", and confirm with the button "Add". A new window with the TECkit map parameters will then pop up:


SILConverter step3

Choose the folder "Setup". Set the location of the TECkit file "cs2unicode.tec" that you downloaded above. After that, press the button "Save in Repository". You can now enter a user-friendly name, for example: "CS <> Unicode".

SILConverter step4

Close all these pop-up windows. When you return to the first window of the converter macro, you will see that this converter is now selected. You can now convert the whole document, parts of it, or only blocks in a certain font. Just experiment with this sample doc-file written in a CS font and try to convert it to Unicode. You should get a file like this one. You can also export the file to ODF (Open Document Format) by using the Sun ODF Plugin for Microsoft Office. The odt-file can then be opened with OpenOffice, and the good thing is: in OpenOffice you can get an extension for spell checking Coptic!

Good luck!



last updated: 28.01.2009
Moheb Mekhaiel