Changeset - 9937ae52f167
[Not reviewed]
default
0 4 0
Mads Kiilerich - 7 years ago 2018-06-09 16:28:33
mads@kiilerich.com
hg: set encoding to utf-8 by default to always show unicode characters correctly

Unicode characters would be shown as '?' if Kallithea was launched in a LANG=C
environment (or similar).

The problem could be solved by setting HGENCODING before launching Kallithea or
before importing Mercurial. These are often not good solutions.

Instead, introduce a hgencoding config setting that triggers monkey patching of
Mercurial.
4 files changed with 15 insertions and 1 deletions:
0 comments (0 inline, 0 general)
development.ini
Show inline comments
 
@@ -151,24 +151,27 @@ gist_alias_url =
 
## Syntax is <ControllerClass>:<function>. Check debug logs for generated names
 
## Recommended settings below are commented out:
 
api_access_controllers_whitelist =
 
#    ChangesetController:changeset_patch,
 
#    ChangesetController:changeset_raw,
 
#    FilesController:raw,
 
#    FilesController:archivefile
 

	
 
## default encoding used to convert from and to unicode
 
## can be also a comma separated list of encoding in case of mixed encodings
 
default_encoding = utf8
 

	
 
## Set Mercurial encoding, similar to setting HGENCODING before launching Kallithea
 
hgencoding = utf-8
 

	
 
## issue tracker for Kallithea (leave blank to disable, absent for default)
 
#bugtracker = https://bitbucket.org/conservancy/kallithea/issues
 

	
 
## issue tracking mapping for commit messages, comments, PR descriptions, ...
 
## Refer to the documentation ("Integration with issue trackers") for more details.
 

	
 
## regular expression to match issue references
 
## This pattern may/should contain parenthesized groups, that can
 
## be referred to in issue_server_link or issue_sub using Python backreferences
 
## (e.g. \1, \2, ...). You can also create named groups with '(?P<groupname>)'.
 
## To require mandatory whitespace before the issue pattern, use:
 
## (?:^|(?<=\s)) before the actual pattern, and for mandatory whitespace
docs/setup.rst
Show inline comments
 
@@ -624,24 +624,27 @@ can be found in ``kallithea.lib.hooks``.
 

	
 

	
 
Changing default encoding
 
-------------------------
 

	
 
By default, Kallithea uses UTF-8 encoding.
 
This is configurable as ``default_encoding`` in the .ini file.
 
This affects many parts in Kallithea including user names, filenames, and
 
encoding of commit messages. In addition Kallithea can detect if the ``chardet``
 
library is installed. If ``chardet`` is detected Kallithea will fallback to it
 
when there are encode/decode errors.
 

	
 
The Mercurial encoding is configurable as ``hgencoding``. It is similar to
 
setting the ``HGENCODING`` environment variable, but will override it.
 

	
 

	
 
Celery configuration
 
--------------------
 

	
 
Kallithea can use the distributed task queue system Celery_ to run tasks like
 
cloning repositories or sending emails.
 

	
 
Kallithea will in most setups work perfectly fine out of the box (without
 
Celery), executing all tasks in the web server process. Some tasks can however
 
take some time to run and it can be better to run such tasks asynchronously in
 
a separate process so the web server can focus on serving web requests.
 

	
 
@@ -885,25 +888,24 @@ Or if using a dispatcher WSGI script wit
 
    WSGIPassAuthorization On
 

	
 
Apache will by default run as a special Apache user, on Linux systems
 
usually ``www-data`` or ``apache``. If you need to have the repositories
 
directory owned by a different user, use the user and group options to
 
WSGIDaemonProcess to set the name of the user and group.
 

	
 
Example WSGI dispatch script:
 

	
 
.. code-block:: python
 

	
 
    import os
 
    os.environ["HGENCODING"] = "UTF-8"
 
    os.environ['PYTHON_EGG_CACHE'] = '/srv/kallithea/.egg-cache'
 

	
 
    # sometimes it's needed to set the current dir
 
    os.chdir('/srv/kallithea/')
 

	
 
    import site
 
    site.addsitedir("/srv/kallithea/venv/lib/python2.7/site-packages")
 

	
 
    ini = '/srv/kallithea/my.ini'
 
    from paste.script.util.logging_config import fileConfig
 
    fileConfig(ini)
 
    from paste.deploy import loadapp
kallithea/config/app_cfg.py
Show inline comments
 
@@ -19,24 +19,25 @@ This file complements the .ini file.
 

	
 
import platform
 
import os, sys, logging
 

	
 
import tg
 
from tg import hooks
 
from tg.configuration import AppConfig
 
from tg.support.converters import asbool
 
import alembic.config
 
from alembic.script.base import ScriptDirectory
 
from alembic.migration import MigrationContext
 
from sqlalchemy import create_engine
 
import mercurial
 

	
 
from kallithea.lib.middleware.https_fixup import HttpsFixup
 
from kallithea.lib.middleware.simplegit import SimpleGit
 
from kallithea.lib.middleware.simplehg import SimpleHg
 
from kallithea.lib.auth import set_available_permissions
 
from kallithea.lib.utils import load_rcextensions, make_ui, set_app_settings, set_vcs_config, \
 
    set_indexer_config, check_git_version, repo2db_mapper
 
from kallithea.lib.utils2 import str2bool
 
import kallithea.model.base
 
from kallithea.model.scm import ScmModel
 

	
 
import formencode
 
@@ -110,24 +111,29 @@ try:
 
    from tgext.debugbar import enable_debugbar
 
    import kajiki # only to check its existence
 
except ImportError:
 
    pass
 
else:
 
    base_config['renderers'].append('kajiki')
 
    enable_debugbar(base_config)
 

	
 

	
 
def setup_configuration(app):
 
    config = app.config
 

	
 
    # Mercurial sets encoding at module import time, so we have to monkey patch it
 
    hgencoding = config.get('hgencoding')
 
    if hgencoding:
 
        mercurial.encoding.encoding = hgencoding
 

	
 
    if config.get('ignore_alembic_revision', False):
 
        log.warn('database alembic revision checking is disabled')
 
    else:
 
        dbconf = config['sqlalchemy.url']
 
        alembic_cfg = alembic.config.Config()
 
        alembic_cfg.set_main_option('script_location', 'kallithea:alembic')
 
        alembic_cfg.set_main_option('sqlalchemy.url', dbconf)
 
        script_dir = ScriptDirectory.from_config(alembic_cfg)
 
        available_heads = sorted(script_dir.get_heads())
 

	
 
        engine = create_engine(dbconf)
 
        with engine.connect() as conn:
kallithea/lib/paster_commands/template.ini.mako
Show inline comments
 
@@ -245,24 +245,27 @@ gist_alias_url =
 
<%text>## Syntax is <ControllerClass>:<function>. Check debug logs for generated names</%text>
 
<%text>## Recommended settings below are commented out:</%text>
 
api_access_controllers_whitelist =
 
#    ChangesetController:changeset_patch,
 
#    ChangesetController:changeset_raw,
 
#    FilesController:raw,
 
#    FilesController:archivefile
 

	
 
<%text>## default encoding used to convert from and to unicode</%text>
 
<%text>## can be also a comma separated list of encoding in case of mixed encodings</%text>
 
default_encoding = utf8
 

	
 
<%text>## Set Mercurial encoding, similar to setting HGENCODING before launching Kallithea</%text>
 
hgencoding = utf-8
 

	
 
<%text>## issue tracker for Kallithea (leave blank to disable, absent for default)</%text>
 
#bugtracker = https://bitbucket.org/conservancy/kallithea/issues
 

	
 
<%text>## issue tracking mapping for commit messages, comments, PR descriptions, ...</%text>
 
<%text>## Refer to the documentation ("Integration with issue trackers") for more details.</%text>
 

	
 
<%text>## regular expression to match issue references</%text>
 
<%text>## This pattern may/should contain parenthesized groups, that can</%text>
 
<%text>## be referred to in issue_server_link or issue_sub using Python backreferences</%text>
 
<%text>## (e.g. \1, \2, ...). You can also create named groups with '(?P<groupname>)'.</%text>
 
<%text>## To require mandatory whitespace before the issue pattern, use:</%text>
 
<%text>## (?:^|(?<=\s)) before the actual pattern, and for mandatory whitespace</%text>
0 comments (0 inline, 0 general)