Changeset 94f7585af8a1 (branch: beta)
Marcin Kuzminski <marcin@python-works.com> - 2010-12-28 18:44:32
fixes to #92, updated changelog
2 files changed with 28 insertions and 13 deletions
docs/changelog.rst
 
.. _changelog:

Changelog
=========

1.2.0 (**2010-12-18**)
----------------------

:status: in-progress
:branch: beta

news
++++

- implemented #91 - added nicer looking archive urls
- implemented #44 into file browsing, and added follow branch option

fixes
+++++

- fixed file browser bug where the url was not changing when switching to a
  given revision from the form
- fixed #92

1.1.0 (**2010-12-18**)
----------------------

news
++++

- rewrite of internals for vcs >=0.1.10
- uses mercurial 1.7 with dotencode disabled for maintaining compatibility
  with older clients
- anonymous access, authentication via ldap
- performance upgrade for cached repos list - each repository has its own
  cache that's invalidated when needed
- performance upgrades on repositories with a large amount of commits (20K+)
- main page quick filter for filtering repositories
- user dashboards with the ability to follow actions of chosen repositories
- sends email to admin on new user registration
- added cache/statistics reset options into repository settings
- more detailed action logger (based on hooks) with lists of pushed changesets
  and options to disable those hooks from the admin panel
- introduced new enhanced changelog for merges that shows more accurate results
- new improved and faster code stats (based on pygments lexers mapping tables),
  showing up to 10 trending sources for each repository; additionally, stats
  can be disabled in repository settings
- gui optimizations, fixed application width to 1024px
- added cut-off limit (for large files/changesets) into config files
- whoosh, celeryd and upgrade moved to paster commands
- database backends other than sqlite can be used

fixes
+++++

- fixes #61 forked repo was showing only after cache expired
- fixes #76 no confirmation on user deletes
- fixes #66 Name field misspelled
- fixes #72 block removal of users that own repositories
- fixes #69 added password confirmation fields
- fixes #87 RhodeCode crashes occasionally on updating repository owner
- fixes #82 broken annotations on files with more than 1 blank line at the end
- a lot of fixes and tweaks for the file browser
- fixed detached session issues
- fixed bug where a user with no repos would see all repos listed in my account
- fixed ui() instance bug where global hgrc settings were loaded for the server
  instance and all hgrc options were merged with our db ui() object
- numerous small bugfixes

(special thanks to TkSoh for detailed feedback)

1.0.2 (**2010-11-12**)
----------------------

news
++++

- tested under python2.7
- bumped sqlalchemy and celery versions

fixes
+++++

- fixed #59 missing graph.js
- fixed repo_size crash when repository had broken symlinks
- fixed python2.5 crashes

1.0.1 (**2010-11-10**)
----------------------

news
++++

- small css updates

fixes
+++++

- fixed #53 python2.5 incompatible enumerate calls
- fixed #52 disable mercurial extension for web
- fixed #51 deleting repositories didn't delete their dependent objects

1.0.0 (**2010-11-02**)
----------------------

- security bugfix: simplehg wasn't checking for permissions on commands
  other than pull or push
- fixed doubled messages after push or pull in admin journal
- templating and css corrections, fixed repo switcher on chrome, updated titles
- admin menu accessible from options menu on repository view
- cached permission queries

1.0.0rc4 (**2010-10-12**)
--------------------------

- fixed python2.5 missing simplejson imports (thanks to Jens Bäckman)
rhodecode/lib/indexers/daemon.py
 
#!/usr/bin/env python
# encoding: utf-8
# whoosh indexer daemon for rhodecode
# Copyright (C) 2009-2010 Marcin Kuzminski <marcin@python-works.com>
#
# -*- coding: utf-8 -*-
"""
    rhodecode.lib.indexers.daemon
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

    A daemon that reads from the task table and runs tasks

    :created_on: Jan 26, 2010
    :author: marcink
    :copyright: (C) 2009-2010 Marcin Kuzminski <marcin@python-works.com>
    :license: GPLv3, see COPYING for more details.
"""
# This program is free software; you can redistribute it and/or
# modify it under the terms of the GNU General Public License
# as published by the Free Software Foundation; version 2
# of the License or (at your option) any later version of the license.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, write to the Free Software
# Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston,
# MA  02110-1301, USA.
"""
Created on Jan 26, 2010

@author: marcink
A daemon that reads from the task table and runs tasks
"""
import sys
import os
import traceback
from os.path import dirname as dn
from os.path import join as jn

#to get the rhodecode import
project_path = dn(dn(dn(dn(os.path.realpath(__file__)))))
sys.path.append(project_path)


from rhodecode.model.scm import ScmModel
from rhodecode.lib.helpers import safe_unicode
from whoosh.index import create_in, open_dir
from shutil import rmtree
from rhodecode.lib.indexers import INDEX_EXTENSIONS, SCHEMA, IDX_NAME

from time import mktime
from vcs.exceptions import ChangesetError, RepositoryError

import logging

log = logging.getLogger('whooshIndexer')
# create logger
log.setLevel(logging.DEBUG)
log.propagate = False
# create console handler and set level to debug
ch = logging.StreamHandler()
ch.setLevel(logging.DEBUG)

# create formatter
formatter = logging.Formatter("%(asctime)s - %(name)s - %(levelname)s - %(message)s")

# add formatter to ch
ch.setFormatter(formatter)

# add ch to logger
log.addHandler(ch)

class WhooshIndexingDaemon(object):
    """
    Daemon for atomic jobs
    """

    def __init__(self, indexname='HG_INDEX', index_location=None,
                 repo_location=None, sa=None):
        self.indexname = indexname

        self.index_location = index_location
        if not index_location:
            raise Exception('You have to provide index location')

        self.repo_location = repo_location
        if not repo_location:
            raise Exception('You have to provide repositories location')

        self.repo_paths = ScmModel(sa).repo_scan(self.repo_location, None)
        self.initial = False
        if not os.path.isdir(self.index_location):
            os.makedirs(self.index_location)
            log.info('Cannot run incremental index since it does not'
                     ' yet exist, running a full build')
            self.initial = True

    def get_paths(self, repo):
        """recursive walk in root dir and return a set of all paths in that
        dir based on the repository walk function
        """
        index_paths_ = set()
        try:
            for topnode, dirs, files in repo.walk('/', 'tip'):
                for f in files:
                    index_paths_.add(jn(repo.path, f.path))
                for dir in dirs:
                    for f in files:
                        index_paths_.add(jn(repo.path, f.path))

        except RepositoryError, e:
            log.debug(traceback.format_exc())
            pass
        return index_paths_

    def get_node(self, repo, path):
        n_path = path[len(repo.path) + 1:]
        node = repo.get_changeset().get_node(n_path)
        return node

    def get_node_mtime(self, node):
        return mktime(node.last_changeset.date.timetuple())

    def add_doc(self, writer, path, repo):
        """Add a doc to the writer; this function itself fetches the data
        from the instance of the vcs backend"""
        node = self.get_node(repo, path)

        #we just index the content of chosen files
        if node.extension in INDEX_EXTENSIONS:
            u_content = node.content
            if not isinstance(u_content, unicode):
                log.warning('  >> %s Could not get this content as unicode '
                            'replacing with empty content', path)
                u_content = u''
            else:
                log.debug('    >> %s [WITH CONTENT]' % path)
        else:
            log.debug('    >> %s' % path)
            #just index file name without its content
            u_content = u''

        writer.add_document(owner=unicode(repo.contact),
                            repository=safe_unicode(repo.name),
                            path=safe_unicode(path),
                            content=u_content,
                            modtime=self.get_node_mtime(node),
                            extension=node.extension)

    def build_index(self):
        if os.path.exists(self.index_location):
            log.debug('removing previous index')
            rmtree(self.index_location)

        if not os.path.exists(self.index_location):
            os.mkdir(self.index_location)

        idx = create_in(self.index_location, SCHEMA, indexname=IDX_NAME)
        writer = idx.writer()

        print self.repo_paths.values()
        for cnt, repo in enumerate(self.repo_paths.values()):
            log.debug('building index @ %s' % repo.path)

            for idx_path in self.get_paths(repo):
                self.add_doc(writer, idx_path, repo)

        log.debug('>> COMMITTING CHANGES <<')
        writer.commit(merge=True)
        log.debug('>>> FINISHED BUILDING INDEX <<<')

    def update_index(self):
        log.debug('STARTING INCREMENTAL INDEXING UPDATE')

        idx = open_dir(self.index_location, indexname=self.indexname)
        # The set of all paths in the index
        indexed_paths = set()
        # The set of all paths we need to re-index
        to_index = set()

        reader = idx.reader()
        writer = idx.writer()

        # Loop over the stored fields in the index
        for fields in reader.all_stored_fields():
            indexed_path = fields['path']
            indexed_paths.add(indexed_path)

            repo = self.repo_paths[fields['repository']]

            try:
                node = self.get_node(repo, indexed_path)
            except ChangesetError:
                # This file was deleted since it was indexed
                log.debug('removing from index %s' % indexed_path)
                writer.delete_by_term('path', indexed_path)

            else:
                # Check if this file was changed since it was indexed
                indexed_time = fields['modtime']
                mtime = self.get_node_mtime(node)
                if mtime > indexed_time:
                    # The file has changed, delete it and add it to the list of
                    # files to reindex
                    log.debug('adding to reindex list %s' % indexed_path)
                    writer.delete_by_term('path', indexed_path)
                    to_index.add(indexed_path)

        # Loop over the files in the filesystem
        # Assume we have a function that gathers the filenames of the
        # documents to be indexed
        for repo in self.repo_paths.values():
            for path in self.get_paths(repo):
                if path in to_index or path not in indexed_paths:
                    # This is either a file that's changed, or a new file
                    # that wasn't indexed before. So index it!
                    self.add_doc(writer, path, repo)
                    log.debug('re-indexing %s' % path)

        log.debug('>> COMMITTING CHANGES <<')
        writer.commit(merge=True)
        log.debug('>>> FINISHED REBUILDING INDEX <<<')

    def run(self, full_index=False):
        """Run daemon"""
        if full_index or self.initial:
            self.build_index()
        else:
            self.update_index()
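
For context, a minimal usage sketch of the class above (not part of this changeset): it assumes a working RhodeCode environment, uses hypothetical example paths for the index and repository locations, and relies only on the constructor and run() shown in daemon.py.

    # Minimal sketch: drive the indexing daemon directly.
    # The two locations below are hypothetical example paths.
    from rhodecode.lib.indexers.daemon import WhooshIndexingDaemon

    daemon = WhooshIndexingDaemon(
        index_location='/var/lib/rhodecode/index',  # hypothetical path
        repo_location='/srv/repos',                 # hypothetical path
    )

    # On the first run the index directory does not exist yet, so self.initial
    # is True and a full build is done; later runs update incrementally.
    daemon.run(full_index=False)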