Regina 7.0 Calculation Engine
Encodings for international strings

Regina's calculation engine uses UTF-8 for all strings (except possibly for filenames; see below). Programmers who pass strings into routines must therefore ensure that those strings are encoded in UTF-8, and programmers who receive strings from routines may assume that they are returned in UTF-8. Plain ASCII is a subset of UTF-8, so plain ASCII text is always safe to use.
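As a rough illustration of the rule above, the following sketch checks whether a byte string is valid UTF-8 before it is handed to a routine. The helper is_valid_utf8() and the sample labels are hypothetical and are not part of Regina; the sketch uses only Python's standard codecs.

```python
def is_valid_utf8(data: bytes) -> bool:
    """Return True if data decodes cleanly as UTF-8 (hypothetical helper, not a Regina routine)."""
    try:
        data.decode("utf-8", errors="strict")
    except UnicodeDecodeError:
        return False
    return True


ascii_label = "Poincare homology sphere"
accented_label = "Poincaré homology sphere"

# Plain ASCII text has exactly the same bytes in ASCII and in UTF-8.
ascii_is_utf8 = ascii_label.encode("ascii") == ascii_label.encode("utf-8")

# The accented text must be encoded as UTF-8; its Latin-1 bytes are rejected.
utf8_bytes = accented_label.encode("utf-8")
latin1_bytes = accented_label.encode("latin-1")

print(ascii_is_utf8)
print(is_valid_utf8(utf8_bytes))
print(is_valid_utf8(latin1_bytes))
```

A check of this kind is only needed when the text comes from an outside source whose encoding is uncertain; strings that are already known to be ASCII or UTF-8 need no extra handling.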

Regina's XML data files are also stored using UTF-8. Very old versions of Regina used LATIN1 (the default at the time for the Qt libraries) and did not specify an encoding in the XML header. Regina's file I/O routines are aware of this, and will convert older data into UTF-8 as it is loaded into memory; the files themselves are not modified. The routine versionUsesUTF8() may be useful for programmers who need to work with older data files at a low level.
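The kind of conversion described above can be sketched as follows. This is not Regina's own loading code: load_text() is a hypothetical helper, and its uses_utf8 flag merely stands in for whatever a check such as versionUsesUTF8() would report (the signature of that routine is not given here, so the sketch does not call it).

```python
def load_text(raw: bytes, uses_utf8: bool) -> str:
    """Decode raw file text, treating old data as LATIN1 (hypothetical helper)."""
    # Assumption: data from very old files is LATIN1, everything else is UTF-8.
    encoding = "utf-8" if uses_utf8 else "latin-1"
    return raw.decode(encoding)


# Bytes as an old LATIN1 file would store the word "Poincaré".
old_bytes = b"Poincar\xe9"

text = load_text(old_bytes, uses_utf8=False)

# In memory the text is re-encoded as UTF-8; old_bytes itself is left unchanged.
converted = text.encode("utf-8")
print(converted)
```

Every byte sequence is valid LATIN1, so the decoding step cannot fail for old data; the risk lies only in applying it to data that is in fact already UTF-8.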

Filenames are a special case, since here Regina must interact with the underlying operating system. All filenames that are passed into routines must be presented in whatever encoding the operating system expects; Regina will simply pass them through to the standard C/C++ file I/O routines (such as fopen() or std::ifstream::open()) without modifying them in any way.
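For callers working from Python, one possible way to obtain a filename in the encoding that the operating system expects is the standard os.fsencode() function. The sketch below is an illustration only: filename_for_os() is a hypothetical helper and the filename is made up.

```python
import os
import sys


def filename_for_os(name: str) -> bytes:
    """Encode a filename in the local filesystem encoding, which need not be UTF-8 (hypothetical helper)."""
    return os.fsencode(name)


# The encoding that this platform uses for filenames.
fs_encoding = sys.getfilesystemencoding()

encoded_name = filename_for_os("sample-data.rga")
print(fs_encoding, encoded_name)
```

The point of the design is that the filename is never forced into UTF-8: it is encoded once, for the operating system, and then left alone.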

Python
The translation of international strings between Python and C++ should be seamless: all Unicode strings passed from Python to C++ will be encoded using UTF-8, and all strings passed from C++ to Python will be assumed to be encoded in UTF-8.
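The effect of this translation can be modelled in pure Python. The two functions below are hypothetical stand-ins for the two directions of the Python/C++ boundary; they are not Regina's binding code, and no Regina routine is called.

```python
def to_engine(text: str) -> bytes:
    """Model of Python -> C++: the Unicode string is encoded as UTF-8 (hypothetical)."""
    return text.encode("utf-8")


def from_engine(raw: bytes) -> str:
    """Model of C++ -> Python: the bytes are assumed to be UTF-8 (hypothetical)."""
    return raw.decode("utf-8")


label = "Möbius band"
wire = to_engine(label)

# The round trip preserves the text, although the byte count differs from
# the character count because "ö" occupies two bytes in UTF-8.
print(from_engine(wire) == label, len(label), len(wire))
```

Since the conversion happens automatically at the boundary, Python code can normally use ordinary str objects throughout and never needs to encode or decode by hand.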

Copyright © 1999-2021, The Regina development team
This software is released under the GNU General Public License, with some additional permissions; see the source code for details.
For further information, or to submit a bug or other problem, please contact Ben Burton (bab@maths.uq.edu.au).