ini: create separate log handlers for color and null, and add comments hinting how they can be used
Let development.ini use color for the root logger as before. The special effect of color_sql was not visible with the default sqlalchemy log level of WARN, so just use color there as well.
clone_url: simplify the logic - move summary handling of different URLs with/without id to db
Drop the half-baked concept of a separate "clone_uri_by_id" - just always use a single template and change back and forth between {repo} and _{repoid} as necessary.
Make it clear that clone_uri_tmpl always is passed as argument.
clone_url: always pass a clone_uri_tmpl, with Repository.DEFAULT_CLONE_URI as last resort
clone_url() had a layering violation of using c.clone_uri_tmpl . This refactoring now makes it clear that this only was used from PullRequest.__json__(), so move the hack there and simplify it.
tests: introduce doctest_mock_ugettext to allow doctests of localized code
Future doctests will require some extra mocking, as the code-under-test uses translation (ugettext aka '_') and its provider TurboGears2 needs a context.
Avoid this complexity by mocking ugettext as the identity function. This is done by providing a pytest fixture 'doctest_mock_ugettext' that will mock uggettext in the module that uses the fixture.
locale: fix environment checks: LC_ALL has precedence over LC_CTYPE
'man 7 locale' describes following precedence:
If the second argument to setlocale(3) is an empty string, "", for the default locale, it is determined using the following steps:
1. If there is a non-null environment variable LC_ALL, the value of LC_ALL is used.
2. If an environment variable with the same name as one of the categories above exists and is non-null, its value is used for that category.
3. If there is a non-null environment variable LANG, the value of LANG is used.
So, if LC_ALL is set, it is used regardless of LC_CTYPE or LANG. Hence, when suggesting users what to change when their locale is invalid, we should also first check LC_ALL instead of LC_CTYPE.
The formencode version range included both 1.2.x and 1.3.x releases. However, since 1.3.0, _to_python and validate_python are deprecated and renamed to _convert_to_python and _validate_python, respectively.
With current pytest, these (long) deprecation warnings are shown in the test logs.
There are two options: - restrict maximum version to 1.2.x - bump minimum version to 1.3.x
In this commit we choose the latter approach, going towards the future rather than the past.
Upgrade notes for these libraries have not been investigated thoroughly, but testing seems to show that it works. We are also early in the development phase, so big problems will be caught by general testing before going wide.
Note: TurboGears2 is not upgraded to 2.4 yet. That upgrade would require us to first move away from using the Pylons compatibility layer.
lib: use Python dot notation for Markdown extensions
Gets rid of:
data/env/lib/python2.7/site-packages/markdown/__init__.py:259: DeprecationWarning: Using short names for Markdown's builtin extensions is deprecated. Use the full path to the extension with Python's dot notation (eg: "markdown.extensions.codehilite" instead of "codehilite"). The current behavior will raise an error in version 2.7. See the Release Notes for Python-Markdown version 2.6 for more info. DeprecationWarning)
config: change default .ini to always include trace_errors settings and thus avoid deprecation warnings
Gets rid of:
data/env/lib/python2.7/site-packages/tg/configuration/app_config.py:473 data/env/lib/python2.7/site-packages/tg/configuration/app_config.py:473: DeprecationWarning: direct usage of error tracing options has been deprecated, please specify them as trace_errors.option_name instad of directly setting option_name. EXAMPLE: trace_errors.error_email "setting option_name. EXAMPLE: trace_errors.error_email", DeprecationWarning)
Note: latest pytest versions has an undeclared dependency / incompatibility with pytest-benchmark, which thus has to be bumped at the same time:
INTERNALERROR> Traceback (most recent call last): INTERNALERROR> File "data/env/lib/python2.7/site-packages/_pytest/main.py", line 202, in wrap_session INTERNALERROR> config._do_configure() INTERNALERROR> File "data/env/lib/python2.7/site-packages/_pytest/config/__init__.py", line 671, in _do_configure INTERNALERROR> self.hook.pytest_configure.call_historic(kwargs=dict(config=self)) INTERNALERROR> File "data/env/lib/python2.7/site-packages/pluggy/hooks.py", line 311, in call_historic INTERNALERROR> res = self._hookexec(self, self.get_hookimpls(), kwargs) INTERNALERROR> File "data/env/lib/python2.7/site-packages/pluggy/manager.py", line 87, in _hookexec INTERNALERROR> return self._inner_hookexec(hook, methods, kwargs) INTERNALERROR> File "data/env/lib/python2.7/site-packages/pluggy/manager.py", line 81, in <lambda> INTERNALERROR> firstresult=hook.spec.opts.get("firstresult") if hook.spec else False, INTERNALERROR> File "data/env/lib/python2.7/site-packages/pluggy/callers.py", line 208, in _multicall INTERNALERROR> return outcome.get_result() INTERNALERROR> File "data/env/lib/python2.7/site-packages/pluggy/callers.py", line 81, in get_result INTERNALERROR> _reraise(*ex) # noqa INTERNALERROR> File "data/env/lib/python2.7/site-packages/pluggy/callers.py", line 187, in _multicall INTERNALERROR> res = hook_impl.function(*args) INTERNALERROR> File "data/env/lib/python2.7/site-packages/pytest_benchmark/plugin.py", line 427, in pytest_configure INTERNALERROR> bs = config._benchmarksession = BenchmarkSession(config) INTERNALERROR> File "data/env/lib/python2.7/site-packages/pytest_benchmark/session.py", line 31, in __init__ INTERNALERROR> self.logger = Logger(self.verbose, config) INTERNALERROR> File "data/env/lib/python2.7/site-packages/pytest_benchmark/logger.py", line 15, in __init__ INTERNALERROR> self.pytest_warn = config.warn INTERNALERROR> AttributeError: 'Config' object has no attribute 'warn'
The new py.test will show deprecation warnings from other libraries and how we use them:
kallithea/tests/__init__.py:28 kallithea/tests/__init__.py:28: PytestAssertRewriteWarning: Module already imported so cannot be rewritten: kallithea.tests pytest.register_assert_rewrite('kallithea.tests')
data/env/lib/python2.7/site-packages/pkg_resources/__init__.py:1145 kallithea/tests/api/test_api_git.py::TestGitApi::test_api_wrong_key data/env/lib/python2.7/site-packages/pkg_resources/__init__.py:1145: DeprecationWarning: Use of .. or absolute path in a resource path is not allowed and will raise exceptions in a future release. self, resource_name
<string>:2 <string>:2: SADeprecationWarning: Mapper.order_by is deprecated.Use Query.order_by() in order to affect the ordering of ORM result sets.
data/env/lib/python2.7/site-packages/tg/configuration/app_config.py:473 data/env/lib/python2.7/site-packages/tg/configuration/app_config.py:473: DeprecationWarning: direct usage of error tracing options has been deprecated, please specify them as trace_errors.option_name instad of directly setting option_name. EXAMPLE: trace_errors.error_email "setting option_name. EXAMPLE: trace_errors.error_email", DeprecationWarning)
data/env/lib/python2.7/site-packages/tg/wsgiapp.py:68 data/env/lib/python2.7/site-packages/tg/wsgiapp.py:68: DeprecationWarning: Session options should start with session. instead of baker.session. app_wrapper = wrapper(self.wrapped_dispatch, self.config)
... kallithea/model/validators.py:279: DeprecationWarning: validate_python is deprecated; use _validate_python instead class _validator(formencode.validators.FancyValidator):
... kallithea/model/validators.py:793: DeprecationWarning: _to_python is deprecated; use _convert_to_python instead class _validator(formencode.validators.FancyValidator):
...
kallithea/tests/other/test_doctest.py::test_doctests[kallithea.lib.markup_renderer] data/env/lib/python2.7/site-packages/markdown/__init__.py:259: DeprecationWarning: Using short names for Markdown's builtin extensions is deprecated. Use the full path to the extension with Python's dot notation (eg: "markdown.extensions.codehilite" instead of "codehilite"). The current behavior will raise an error in version 2.7. See the Release Notes for Python-Markdown version 2.6 for more info. DeprecationWarning)
kallithea/tests/other/test_doctest.py::test_doctests[kallithea.lib.markup_renderer] data/env/lib/python2.7/site-packages/markdown/__init__.py:259: DeprecationWarning: Using short names for Markdown's builtin extensions is deprecated. Use the full path to the extension with Python's dot notation (eg: "markdown.extensions.extra" instead of "extra"). The current behavior will raise an error in version 2.7. See the Release Notes for Python-Markdown version 2.6 for more info. DeprecationWarning)
pytest can run doctests as part of the standard testsuite run. See documentation at [1].
pytest will discover doctests in all python files it can find. However, some files cannot be imported directly in this manner. Fix this by adding a filter in conftest.py in the root directory. This code could also live in kallithea/conftest.py, but it cannot be in kallithea/tests/conftest.py because that level is deeper than the files we want to filter.
scm: don't try to get IP address from web request in model
Remove a layering violation and make functions more reusable when they no longer depend on global state.
At this level, the IP address (and information about the current user) is only used for hooks logging push / pull operations. Arguably, IP address logging only belongs in an HTTP access log, not in the log of push/pull operations. But as long as we have IP addresses in the logs, we have to provide it. The (good?) alternative would be to drop IP address from the push / pull logs ...
The parameter might be conceptually nice, but it was only available for 2 hooks. To be really useful, it should have been available everywhere. It also only reported the URL from the web request that initiated the hook ... and as such it does some layering violations. The user URL might be the address that should be used internally from the hook. And it can conceptually not be made available actions that doesn't originate from a user web request. It seems much better that custom hooks configure what they want to do. Perhaps by reading the .ini file and using canonical_url.
middleware: remove access fallback to reuse previous access - drop _git_stored_op
Before, the previous action was kept in the global controller instance. That was conceptually wrong. The previous request might be entirely unrelated, coming from another user.
It was mainly used for 'info/refs' commands ... but even more, that will be the first command that is sent, giving nothing relevant to reuse.
Fortunately, with handling of 'info/refs', we no longer seem to need it.
The fallback for unknown commands with unknown 'action' is now to return a HTTP failure, like we do for Mercurial.
middleware: fix handling of Git 'info/refs' command to give correct access control
For a pull, the Git client first sends an 'info/refs' command with a 'service=git-upload-pack' query, then it sends the actual 'git-upload-pack' command.
For a push, the Git client first sends an 'info/refs' command with a 'service=git-receive-pack' query, then it sends the actual 'git-receive-pack' command.
Before, the 'info/refs' commands would fall back to the default of trying to use the action of the previous request. That seems wrong.
Instead, authorize the 'info/refs' command just like the actual command it references.
path_info will now be checked more than before. Mainly because that is more correct and more explicit and "better" to do it that way. It might also give some safety.
middleware: move handling of permanent repo URLs to separate middleware
This is about the handling of repo URLs like '_123' for the repo with repo_id 123. The de-mangling of such URLs was spread out across multiple layers. It fits much more nicely as a middleware layer. The code in routing and simplehg / simplegit can thus be removed.
The base _get_by_id function was confusing - fix it by removing it. To do that, refactor utils introducing fix_repo_id_name to replace get_repo_by_id.
We now assume in the application that we never have any extra leading '/' in URL paths.
And while trailing extra '/' might be fine in actual URLs, they must be handled at the routing level, not propagated through all layers. This changeset is not really changing that.
Back out ccbdff90e5a0. That seemed like an odd hack. In order to work properly, it should not only be applied for protocol access middleware, but also for web UI and for commands. So evidently, it is not really necessary.
The problem it describes is fixed much better in 5e501b6ee639 by setting the right python executable in the hook scripts, further improved in 1bafb2d07709 and 6df08d78f8e7 to *actually* use the right python executable.
hg: prepare for Mercurial 5.0 changing "exact" arguments
In the backward compat wrapper, we use root=None. That might seem a bit risky. But it seems to work for the single use case we have, and the changeset dropped it in Mercurial https://www.mercurial-scm.org/repo/hg/rev/0531dff73d0b hint that this parameter really is unused.
It would be a a stepping stone for the migration to Python 3 to only support Python 2.7. Even though we don't make any big changes now, it might allow us to remove some workarounds or use some new forward-compatible features.
Mercurial dropped support for Python 2.6 2 years ago.
auth: move CSRF checks from the optional LoginRequired to the more basic BaseController._before
_before is not called for the CSRF-immune JSON-API controller and is thus a good place to check CSRF. This also apply CSRF protection to the login controller.
The flag for needing CSRF checking is stored in the thread global request object when passed from __call__ to _before for regular controllers. It is thus also set for requests to the JSON-RPC controller, but not used.
auth: simplify API key auth - move it out of _determine_auth_user
This gives less of the special handling of API key auth in LoginRequired ... but we still need to disable the LoginRequired CSRF protection for API key auth.
tests: prepare for adding CSRF protection on login forms
CSRF is about avoiding abuse of credentials by doing things in existing sessions. The login form does not have any previous credentials, so there is nothing to abuse and no real need for CSRF protection. But there is still an unauth session, so we *can* have CSRF protection.
CSRF protection is currently in LoginRequired (which obviously isn't applied to the login form), but let's prepare for changing that.
tests: make test_admin_users user_and_repo_group_fail() fixture more stable
When adding authentication_token() to log_user(), database session lifetime will in some cases change:
test_admin_users test_delete_repo_group_err() use the user_and_repo_group_fail() fixture.
Before, it got ObjectDeletedError when trying to delete a deleted RepoGroup and moved on.
After changing log_user(), py.test would emit a warning:
kallithea/tests/functional/test_admin_users.py::TestAdminUsersController::()::test_delete_repo_group_err .../site-packages/sqlalchemy/orm/persistence.py:1340: SAWarning: DELETE statement on table 'groups' expected to delete 1 row(s); 0 were matched. Please set confirm_deleted_rows=False within the mapper configuration to prevent this warning. % (table.description, expected, rows_matched)
Instead, use RepoGroup.get_by_group_name to verify the group exists before trying to delete it.
It seems like other ways of tracking authentication state are better. AuthUser is a *potentially* authenticated user. We prefer to keep it as that, without modifying the AuthUser object if the user actually should be authenticated.
The primariy indicator that a user is authenticated is when the AuthUser is set as request.authuser .
(Alternatively, we could create an AuthenticatedUser sub-class and move things like access control checks there. That would help ensuring it is used correctly, without having to check an is_authenticated flag.)
auth: drop "multiple_counter" from computing permissions
This seems to have been something about having some permissions override existing permissions. It is not clear to me why anybody should want that.
test_user_group_permissions_on_repo_groups.py seems to have been testing for something we don't want. The new behaviour seems more reasonable. The test user is inhering access from the default user, and thus in this case getting read access (except when private).
auth: minor refactoring of computation of admin access for repo owners
Make the flow slightly simpler ... and now when permissions are merged, we only have to set repo owner access once.
BUT: because multiple_counter, we actually don't merge permissions in all cases. This will thus introduce a regression that will be fixed in next changeset.
auth: explicit user permission should not blindly overrule permissions through user groups
Before, explicit permissions of a user could shadow higher permissions that would otherwise be obtained through a group the user is member of. That was confusing and fragile: *removing* a permission could then suddenly give a user *more* permissions.
Instead, change the flag for controlling internal permission computation to *not* use "explicit". Permissions will then add up, no matter if they are explicit or through groups.
The change in auth.py is small, but read the body of __get_perms to see the actual impact ... and also the clean-up changeset that will come next.
This might in some cases be a behaviour change and give users more access ... but it will probably only give the user that was intended. This change can thus be seen as a bugfix.
Some tests assumed the old behaviour. Not for good reasons, but just because that is how they were written. These tests are updated to expect the new behaviour, and it has been reviewed that it makes sense.
Note that this 'explicit' flag mostly is for repo permissions and independent of the 'user_inherit_default_permissions' that just was removed and is about global permissions.
auth: global permissions given to the default user are the bare minimum and should apply to *all* other users too
Drop the "subtractive permission" config option "inherit_from_default" that when set to false would give users less global permissions than the default unauthenticated user.
Instead, think positive and merge all positive permissions.
At the end, filter the global permissions to make sure we for each kind of permissions only keep the one with most weight.
The core of the functionality is to process a list of "raw id"s, log them, and update / invalidate caches.
handle_git_post_receive and scm _handle_push already provide that list directly. Things get much simpler when introducing a new function (process_pushed_raw_ids) just for processing pushed raw ids. That also makes it clear that scm _handle_push doesn't need any repo.
log_push_action remains the native entry point for the Mercurial hook. It was not entirely correct using 'node:tip' - after Mercurial 3.7 and d6d3cf5fda6f, it should be 'node:node_last'.
After several trivial refactorings, it turns out that the logic for creating the hash list for Mercurial actually is very simple ...
locking: drop the pull-to-lock / push-to-unlock functionality
The feature is not worth the maintenance cost. The locking is too coarse and unflexible with insufficient UI and UX. The implementation is also quite invasive in tricky areas of the code, and thus high maintenance. Dropping this will enable other cleanup ... or at least make it easier.
Sometimes, test_delete_ip would fail because UserIpMap entries left behind. It was perhaps because a commit was missing and sessions thus sometimes were leaked?