Changeset - 93cffcb6fd54
[Not reviewed]
beta
0 2 0
Jared Bunting - 14 years ago 2011-07-02 00:46:01
jared.bunting@peachjean.com
Adding documentation for indexer's self-resolving repos location.
2 files changed with 12 insertions and 11 deletions:
0 comments (0 inline, 0 general)
docs/setup.rst
Show inline comments
 
@@ -10,220 +10,221 @@ Setting up RhodeCode
 
First, you will need to create a RhodeCode configuration file. Run the 
 
following command to do this::
 
 
 
    paster make-config RhodeCode production.ini
 

	
 
- This will create the file `production.ini` in the current directory. This
 
  configuration file contains the various settings for RhodeCode, e.g proxy 
 
  port, email settings, usage of static files, cache, celery settings and 
 
  logging.
 

	
 

	
 
Next, you need to create the databases used by RhodeCode. I recommend that you
 
use sqlite (default) or postgresql. If you choose a database other than the
 
default ensure you properly adjust the db url in your production.ini
 
configuration file to use this other database. Create the databases by running
 
the following command::
 

	
 
    paster setup-app production.ini
 

	
 
This will prompt you for a "root" path. This "root" path is the location where
 
RhodeCode will store all of its repositories on the current machine. After
 
entering this "root" path ``setup-app`` will also prompt you for a username 
 
and password for the initial admin account which ``setup-app`` sets up for you.
 

	
 
- The ``setup-app`` command will create all of the needed tables and an admin
 
  account. When choosing a root path you can either use a new empty location, 
 
  or a location which already contains existing repositories. If you choose a 
 
  location which contains existing repositories RhodeCode will simply add all 
 
  of the repositories at the chosen location to it's database. (Note: make 
 
  sure you specify the correct path to the root).
 
- Note: the given path for mercurial_ repositories **must** be write accessible
 
  for the application. It's very important since the RhodeCode web interface 
 
  will work without write access, but when trying to do a push it will 
 
  eventually fail with permission denied errors unless it has write access.
 

	
 
You are now ready to use RhodeCode, to run it simply execute::
 
 
 
    paster serve production.ini
 
 
 
- This command runs the RhodeCode server. The web app should be available at the 
 
  127.0.0.1:5000. This ip and port is configurable via the production.ini 
 
  file created in previous step
 
- Use the admin account you created above when running ``setup-app`` to login 
 
  to the web app.
 
- The default permissions on each repository is read, and the owner is admin. 
 
  Remember to update these if needed.
 
- In the admin panel you can toggle ldap, anonymous, permissions settings. As
 
  well as edit more advanced options on users and repositories
 

	
 
Try copying your own mercurial repository into the "root" directory you are
 
using, then from within the RhodeCode web application choose Admin >
 
repositories. Then choose Add New Repository. Add the repository you copied 
 
into the root. Test that you can browse your repository from within RhodeCode 
 
and then try cloning your repository from RhodeCode with::
 

	
 
    hg clone http://127.0.0.1:5000/<repository name>
 

	
 
where *repository name* is replaced by the name of your repository.
 

	
 
Using RhodeCode with SSH
 
------------------------
 

	
 
RhodeCode currently only hosts repositories using http and https. (The addition
 
of ssh hosting is a planned future feature.) However you can easily use ssh in
 
parallel with RhodeCode. (Repository access via ssh is a standard "out of
 
the box" feature of mercurial_ and you can use this to access any of the
 
repositories that RhodeCode is hosting. See PublishingRepositories_)
 

	
 
RhodeCode repository structures are kept in directories with the same name 
 
as the project. When using repository groups, each group is a subdirectory.
 
This allows you to easily use ssh for accessing repositories.
 

	
 
In order to use ssh you need to make sure that your web-server and the users 
 
login accounts have the correct permissions set on the appropriate directories.
 
(Note that these permissions are independent of any permissions you have set up
 
using the RhodeCode web interface.)
 

	
 
If your main directory (the same as set in RhodeCode settings) is for example
 
set to **/home/hg** and the repository you are using is named `rhodecode`, then
 
to clone via ssh you should run::
 

	
 
    hg clone ssh://user@server.com/home/hg/rhodecode
 

	
 
Using other external tools such as mercurial-server_ or using ssh key based
 
authentication is fully supported.
 

	
 
Note: In an advanced setup, in order for your ssh access to use the same
 
permissions as set up via the RhodeCode web interface, you can create an
 
authentication hook to connect to the rhodecode db and runs check functions for
 
permissions against that.
 
    
 
Setting up Whoosh full text search
 
----------------------------------
 

	
 
Starting from version 1.1 the whoosh index can be build by using the paster
 
command ``make-index``. To use ``make-index`` you must specify the configuration
 
file that stores the location of the index, and the location of the repositories
 
(`--repo-location`).Starting from version 1.2 it is 
 
also possible to specify a comma separated list of repositories (`--index-only`)
 
to build index only on chooses repositories skipping any other found in repos
 
location
 
file that stores the location of the index. You may specify the location of the 
 
repositories (`--repo-location`).  If not specified, this value is retrieved 
 
from the RhodeCode database.  This was required prior to 1.2.  Starting from 
 
version 1.2 it is also possible to specify a comma separated list of 
 
repositories (`--index-only`) to build index only on chooses repositories 
 
skipping any other found in repos location
 

	
 
You may optionally pass the option `-f` to enable a full index rebuild. Without
 
the `-f` option, indexing will run always in "incremental" mode.
 

	
 
For an incremental index build use::
 

	
 
	paster make-index production.ini --repo-location=<location for repos> 
 
	paster make-index production.ini 
 

	
 
For a full index rebuild use::
 

	
 
	paster make-index production.ini -f --repo-location=<location for repos>
 
	paster make-index production.ini -f 
 

	
 

	
 
building index just for chosen repositories is possible with such command::
 
 
 
 paster make-index production.ini --repo-location=<location for repos> --index-only=vcs,rhodecode
 
 paster make-index production.ini --index-only=vcs,rhodecode
 

	
 

	
 
In order to do periodical index builds and keep your index always up to date.
 
It's recommended to do a crontab entry for incremental indexing. 
 
An example entry might look like this::
 
 
 
    /path/to/python/bin/paster /path/to/rhodecode/production.ini --repo-location=<location for repos> 
 
    /path/to/python/bin/paster make-index /path/to/rhodecode/production.ini 
 
  
 
When using incremental mode (the default) whoosh will check the last
 
modification date of each file and add it to be reindexed if a newer file is
 
available. The indexing daemon checks for any removed files and removes them
 
from index.
 

	
 
If you want to rebuild index from scratch, you can use the `-f` flag as above,
 
or in the admin panel you can check `build from scratch` flag.
 

	
 

	
 
Setting up LDAP support
 
-----------------------
 

	
 
RhodeCode starting from version 1.1 supports ldap authentication. In order
 
to use LDAP, you have to install the python-ldap_ package. This package is 
 
available via pypi, so you can install it by running
 

	
 
using easy_install::
 

	
 
    easy_install python-ldap
 
 
 
using pip::
 

	
 
    pip install python-ldap
 

	
 
.. note::
 
   python-ldap requires some certain libs on your system, so before installing 
 
   it check that you have at least `openldap`, and `sasl` libraries.
 

	
 
LDAP settings are located in admin->ldap section,
 

	
 
Here's a typical ldap setup::
 

	
 
 Connection settings
 
 Enable LDAP          = checked
 
 Host                 = host.example.org
 
 Port                 = 389
 
 Account              = <account>
 
 Password             = <password>
 
 Connection Security  = LDAPS connection
 
 Certificate Checks   = DEMAND
 

	
 
 Search settings
 
 Base DN              = CN=users,DC=host,DC=example,DC=org
 
 LDAP Filter          = (&(objectClass=user)(!(objectClass=computer)))
 
 LDAP Search Scope    = SUBTREE
 

	
 
 Attribute mappings
 
 Login Attribute      = uid
 
 First Name Attribute = firstName
 
 Last Name Attribute  = lastName
 
 E-mail Attribute     = mail
 

	
 
.. _enable_ldap:
 

	
 
Enable LDAP : required
 
    Whether to use LDAP for authenticating users.
 

	
 
.. _ldap_host:
 

	
 
Host : required
 
    LDAP server hostname or IP address.
 

	
 
.. _Port:
 

	
 
Port : required
 
    389 for un-encrypted LDAP, 636 for SSL-encrypted LDAP.
 

	
 
.. _ldap_account:
 

	
 
Account : optional
 
    Only required if the LDAP server does not allow anonymous browsing of
 
    records.  This should be a special account for record browsing.  This
 
    will require `LDAP Password`_ below.
 

	
 
.. _LDAP Password:
 

	
 
Password : optional
 
    Only required if the LDAP server does not allow anonymous browsing of
 
    records.
 

	
 
.. _Enable LDAPS:
 

	
 
Connection Security : required
 
    Defines the connection to LDAP server
 

	
 
    No encryption
 
        Plain non encrypted connection
 
        
 
    LDAPS connection
 
        Enable ldaps connection. It will likely require `Port`_ to be set to 
 
        a different value (standard LDAPS port is 636). When LDAPS is enabled 
 
        then `Certificate Checks`_ is required.
 
        
 
    START_TLS on LDAP connection
 
        START TLS connection
 
@@ -463,97 +464,97 @@ http://wiki.pylonshq.com/display/pylonsc
 

	
 
Apache as subdirectory
 
----------------------
 

	
 
Apache subdirectory part::
 

	
 
    <Location /<someprefix> >
 
      ProxyPass http://127.0.0.1:5000/<someprefix>
 
      ProxyPassReverse http://127.0.0.1:5000/<someprefix>
 
      SetEnvIf X-Url-Scheme https HTTPS=1
 
    </Location> 
 

	
 
Besides the regular apache setup you will need to add the following line
 
into [app:main] section of your .ini file::
 

	
 
    filter-with = proxy-prefix
 

	
 
Add the following at the end of the .ini file::
 

	
 
    [filter:proxy-prefix]
 
    use = egg:PasteDeploy#prefix
 
    prefix = /<someprefix> 
 

	
 

	
 
then change <someprefix> into your choosen prefix
 

	
 
Apache's WSGI config
 
--------------------
 

	
 

	
 
Example wsgi dispatch script::
 

	
 
    import os
 
    os.environ["HGENCODING"] = "UTF-8"
 
    os.environ['PYTHON_EGG_CACHE'] = '/home/web/rhodecode/.egg-cache'
 
    
 
    # sometimes it's needed to set the curent dir
 
    os.chdir('/home/web/rhodecode/') 
 
    
 
    from paste.deploy import loadapp
 
    from paste.script.util.logging_config import fileConfig
 

	
 
    fileConfig('/home/web/rhodecode/production.ini')
 
    application = loadapp('config:/home/web/rhodecode/production.ini')
 

	
 

	
 
Other configuration files
 
-------------------------
 

	
 
Some example init.d scripts can be found here, for debian and gentoo:
 

	
 
https://rhodecode.org/rhodecode/files/tip/init.d
 

	
 

	
 
Troubleshooting
 
---------------
 

	
 
:Q: **Missing static files?**
 
:A: Make sure either to set the `static_files = true` in the .ini file or
 
   double check the root path for your http setup. It should point to 
 
   for example:
 
   /home/my-virtual-python/lib/python2.6/site-packages/rhodecode/public
 
   
 
| 
 

	
 
:Q: **Can't install celery/rabbitmq**
 
:A: Don't worry RhodeCode works without them too. No extra setup is required.
 

	
 
|
 
 
 
:Q: **Long lasting push timeouts?**
 
:A: Make sure you set a longer timeouts in your proxy/fcgi settings, timeouts
 
    are caused by https server and not RhodeCode.
 
    
 
| 
 

	
 
:Q: **Large pushes timeouts?**
 
:A: Make sure you set a proper max_body_size for the http server.
 

	
 
|
 

	
 
:Q: **Apache doesn't pass basicAuth on pull/push?**
 
:A: Make sure you added `WSGIPassAuthorization true`.
 

	
 
For further questions search the `Issues tracker`_, or post a message in the 
 
`google group rhodecode`_
 

	
 
.. _virtualenv: http://pypi.python.org/pypi/virtualenv
 
.. _python: http://www.python.org/
 
.. _mercurial: http://mercurial.selenic.com/
 
.. _celery: http://celeryproject.org/
 
.. _rabbitmq: http://www.rabbitmq.com/
 
.. _python-ldap: http://www.python-ldap.org/
 
.. _mercurial-server: http://www.lshift.net/mercurial-server.html
 
.. _PublishingRepositories: http://mercurial.selenic.com/wiki/PublishingRepositories
 
.. _Issues tracker: https://bitbucket.org/marcinkuzminski/rhodecode/issues
 
.. _google group rhodecode: http://groups.google.com/group/rhodecode
 
\ No newline at end of file
 
.. _google group rhodecode: http://groups.google.com/group/rhodecode
rhodecode/lib/indexers/__init__.py
Show inline comments
 
@@ -20,193 +20,193 @@
 
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 
# GNU General Public License for more details.
 
#
 
# You should have received a copy of the GNU General Public License
 
# along with this program.  If not, see <http://www.gnu.org/licenses/>.
 
import os
 
import sys
 
import traceback
 
from os.path import dirname as dn, join as jn
 

	
 
#to get the rhodecode import
 
sys.path.append(dn(dn(dn(os.path.realpath(__file__)))))
 

	
 
from string import strip
 
from shutil import rmtree
 

	
 
from whoosh.analysis import RegexTokenizer, LowercaseFilter, StopFilter
 
from whoosh.fields import TEXT, ID, STORED, Schema, FieldType
 
from whoosh.index import create_in, open_dir
 
from whoosh.formats import Characters
 
from whoosh.highlight import highlight, SimpleFragmenter, HtmlFormatter
 

	
 
from webhelpers.html.builder import escape
 
from sqlalchemy import engine_from_config
 
from vcs.utils.lazy import LazyProperty
 

	
 
from rhodecode.model import init_model
 
from rhodecode.model.scm import ScmModel
 
from rhodecode.model.repo import RepoModel
 
from rhodecode.config.environment import load_environment
 
from rhodecode.lib import LANGUAGES_EXTENSIONS_MAP
 
from rhodecode.lib.utils import BasePasterCommand, Command, add_cache
 

	
 
#EXTENSIONS WE WANT TO INDEX CONTENT OFF
 
INDEX_EXTENSIONS = LANGUAGES_EXTENSIONS_MAP.keys()
 

	
 
#CUSTOM ANALYZER wordsplit + lowercase filter
 
ANALYZER = RegexTokenizer(expression=r"\w+") | LowercaseFilter()
 

	
 

	
 
#INDEX SCHEMA DEFINITION
 
SCHEMA = Schema(owner=TEXT(),
 
                repository=TEXT(stored=True),
 
                path=TEXT(stored=True),
 
                content=FieldType(format=Characters(ANALYZER),
 
                             scorable=True, stored=True),
 
                modtime=STORED(), extension=TEXT(stored=True))
 

	
 

	
 
IDX_NAME = 'HG_INDEX'
 
FORMATTER = HtmlFormatter('span', between='\n<span class="break">...</span>\n')
 
FRAGMENTER = SimpleFragmenter(200)
 

	
 

	
 
class MakeIndex(BasePasterCommand):
 

	
 
    max_args = 1
 
    min_args = 1
 

	
 
    usage = "CONFIG_FILE"
 
    summary = "Creates index for full text search given configuration file"
 
    group_name = "RhodeCode"
 
    takes_config_file = -1
 
    parser = Command.standard_parser(verbose=True)
 

	
 
    def command(self):
 

	
 
        from pylons import config
 
        add_cache(config)
 
        engine = engine_from_config(config, 'sqlalchemy.db1.')
 
        init_model(engine)
 

	
 
        index_location = config['index_dir']
 
        repo_location = self.options.repo_location if self.options.repo_location else RepoModel().repos_path
 
        repo_list = map(strip, self.options.repo_list.split(',')) \
 
            if self.options.repo_list else None
 

	
 
        #======================================================================
 
        # WHOOSH DAEMON
 
        #======================================================================
 
        from rhodecode.lib.pidlock import LockHeld, DaemonLock
 
        from rhodecode.lib.indexers.daemon import WhooshIndexingDaemon
 
        try:
 
            l = DaemonLock(file=jn(dn(dn(index_location)), 'make_index.lock'))
 
            WhooshIndexingDaemon(index_location=index_location,
 
                                 repo_location=repo_location,
 
                                 repo_list=repo_list)\
 
                .run(full_index=self.options.full_index)
 
            l.release()
 
        except LockHeld:
 
            sys.exit(1)
 

	
 
    def update_parser(self):
 
        self.parser.add_option('--repo-location',
 
                          action='store',
 
                          dest='repo_location',
 
                          help="Specifies repositories location to index REQUIRED",
 
                          help="Specifies repositories location to index OPTIONAL",
 
                          )
 
        self.parser.add_option('--index-only',
 
                          action='store',
 
                          dest='repo_list',
 
                          help="Specifies a comma separated list of repositores "
 
                                "to build index on OPTIONAL",
 
                          )
 
        self.parser.add_option('-f',
 
                          action='store_true',
 
                          dest='full_index',
 
                          help="Specifies that index should be made full i.e"
 
                                " destroy old and build from scratch",
 
                          default=False)
 

	
 
class ResultWrapper(object):
 
    def __init__(self, search_type, searcher, matcher, highlight_items):
 
        self.search_type = search_type
 
        self.searcher = searcher
 
        self.matcher = matcher
 
        self.highlight_items = highlight_items
 
        self.fragment_size = 200 / 2
 

	
 
    @LazyProperty
 
    def doc_ids(self):
 
        docs_id = []
 
        while self.matcher.is_active():
 
            docnum = self.matcher.id()
 
            chunks = [offsets for offsets in self.get_chunks()]
 
            docs_id.append([docnum, chunks])
 
            self.matcher.next()
 
        return docs_id
 

	
 
    def __str__(self):
 
        return '<%s at %s>' % (self.__class__.__name__, len(self.doc_ids))
 

	
 
    def __repr__(self):
 
        return self.__str__()
 

	
 
    def __len__(self):
 
        return len(self.doc_ids)
 

	
 
    def __iter__(self):
 
        """
 
        Allows Iteration over results,and lazy generate content
 

	
 
        *Requires* implementation of ``__getitem__`` method.
 
        """
 
        for docid in self.doc_ids:
 
            yield self.get_full_content(docid)
 

	
 
    def __getitem__(self, key):
 
        """
 
        Slicing of resultWrapper
 
        """
 
        i, j = key.start, key.stop
 

	
 
        slice = []
 
        for docid in self.doc_ids[i:j]:
 
            slice.append(self.get_full_content(docid))
 
        return slice
 

	
 

	
 
    def get_full_content(self, docid):
 
        res = self.searcher.stored_fields(docid[0])
 
        f_path = res['path'][res['path'].find(res['repository']) \
 
                             + len(res['repository']):].lstrip('/')
 

	
 
        content_short = self.get_short_content(res, docid[1])
 
        res.update({'content_short':content_short,
 
                    'content_short_hl':self.highlight(content_short),
 
                    'f_path':f_path})
 

	
 
        return res
 

	
 
    def get_short_content(self, res, chunks):
 

	
 
        return ''.join([res['content'][chunk[0]:chunk[1]] for chunk in chunks])
 

	
 
    def get_chunks(self):
 
        """
 
        Smart function that implements chunking the content
 
        but not overlap chunks so it doesn't highlight the same
 
        close occurrences twice.
 
        
 
        :param matcher:
 
        :param size:
 
        """
 
        memory = [(0, 0)]
 
        for span in self.matcher.spans():
 
            start = span.startchar or 0
 
            end = span.endchar or 0
 
            start_offseted = max(0, start - self.fragment_size)
 
            end_offseted = end + self.fragment_size
 

	
 
            if start_offseted < memory[-1][1]:
 
                start_offseted = memory[-1][1]
0 comments (0 inline, 0 general)