Files @ caef0be39948

Location: kallithea/CONTRIBUTORS

FUJIWARA Katsunori
search: make "repository:" condition work as expected

Before this revision, a "repository:foo" condition when searching
"File contents" or "File names" showed files in all of the
repositories below.

- foo
- foo/bar
- foo-bar
- and so on ...

The Whoosh library, which is used to parse text for indexing and
searching, does:

- treat almost all non-alphanumeric characters as delimiters, both when
  indexing search items and when parsing search conditions
- make each field of a search item be indexed by multiple values

For example, files in the "foo/bar" repository are indexed by "foo" and
"bar" in the "repository" field. This tokenization makes the
"repository:foo" search condition match files in the "foo/bar"
repository, too.
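
The over-matching described above can be reproduced without Whoosh. The
following is only a rough pure-Python stand-in, not Whoosh code: the
helper names `tokenize` and `matches` are hypothetical, and the splitting
rule (any run of non-alphanumeric characters is a delimiter) is a
simplifying assumption about the default analyzer.

```python
import re


def tokenize(value):
    # Assumed simplification of a tokenizing analyzer: split on runs of
    # non-alphanumeric characters and lowercase each piece.
    return [t.lower() for t in re.split(r"[^0-9A-Za-z]+", value) if t]


def matches(query, repo_name):
    # The field is indexed by several tokens, so a condition matches
    # whenever every query token is among the indexed tokens.
    indexed = set(tokenize(repo_name))
    return all(t in indexed for t in tokenize(query))


repos = ["foo", "foo/bar", "foo-bar", "baz"]
hits = [r for r in repos if matches("foo", r)]
print(hits)
```

With these assumptions, "foo", "foo/bar" and "foo-bar" are all hits for
"repository:foo", while "baz" is not.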

In addition, using plain TEXT also causes "stop words" in search
conditions to be ignored unintentionally. For example, "this", "a",
"you", and so on are dropped at indexing and parsing, because they are
too generic from the point of view of generic "text search".

This issue can't be resolved by using ID instead of TEXT for
"repository" of SCHEMA, as previous revisions did for JOURNAL_SCHEMA,
because:

- highlighting file content requires SCHEMA to support the "positions"
  feature, but using ID instead of TEXT disables it
- using ID violates the current case-insensitive search policy, because
  it preserves the case of text

To make "repository:" condition work as expected, this revision
explicitly specifies "analyzer", which does:

- avoid tokenization
- match case-insensitively
- avoid removing "stop words" from text

This revision requires a full rebuild of the index tables, because the
indexing schema has changed.

BTW, a "repository:" condition when searching "Commit messages" uses
CHGSETS_SCHEMA instead of SCHEMA. The former uses ID for "repository",
and it does:

- avoid the issues caused by tokenization and by removing "stop words"

- disable the "positions" feature of CHGSETS_SCHEMA

  But highlighting file content isn't needed when searching
  "Commit messages". Therefore, this can be ignored.

- preserve the case of text

  This violates the current case-insensitive search policy. This issue
  will be fixed by a subsequent revision, because fixing it isn't so
  simple.

List of contributors to Kallithea project:

    Mads Kiilerich <madski@unity3d.com> 2012-2016
    Takumi IINO <trot.thunder@gmail.com> 2012-2016
    Unity Technologies 2012-2016
    Andrew Shadura <andrew@shadura.me> 2012 2014-2016
    Dominik Ruf <dominikruf@gmail.com> 2012 2014-2016
    Thomas De Schampheleire <thomas.de.schampheleire@gmail.com> 2014-2016
    Étienne Gilli <etienne.gilli@gmail.com> 2015-2016
    Jan Heylen <heyleke@gmail.com> 2015-2016
    Robert Martinez <ntttq@inboxen.org> 2015-2016
    Robert Rauch <mail@robertrauch.de> 2015-2016
    Søren Løvborg <sorenl@unity3d.com> 2015-2016
    Angel Ezquerra <angel.ezquerra@gmail.com> 2016
    Asterios Dimitriou <steve@pci.gr> 2016
    Kateryna Musina <kateryna@unity3d.com> 2016
    Konstantin Veretennicov <kveretennicov@gmail.com> 2016
    Oscar Curero <oscar@naiandei.net> 2016
    Robert James Dennington <tinytimrob@googlemail.com> 2016
    timeless@gmail.com 2016
    YFdyh000 <yfdyh000@gmail.com> 2016
    Aras Pranckevičius <aras@unity3d.com> 2012-2013 2015
    Sean Farley <sean.michael.farley@gmail.com> 2013-2015
    Christian Oyarzun <oyarzun@gmail.com> 2014-2015
    Joseph Rivera <rivera.d.joseph@gmail.com> 2014-2015
    Michal Čihař <michal@cihar.com> 2014-2015
    Anatoly Bubenkov <bubenkoff@gmail.com> 2015
    Andrew Bartlett <abartlet@catalyst.net.nz> 2015
    Balázs Úr <urbalazs@gmail.com> 2015
    Ben Finney <ben@benfinney.id.au> 2015
    Branko Majic <branko@majic.rs> 2015
    Daniel Hobley <danielh@unity3d.com> 2015
    David Avigni <david.avigni@ankapi.com> 2015
    Denis Blanchette <dblanchette@coveo.com> 2015
    duanhongyi <duanhongyi@doopai.com> 2015
    EriCSN Chang <ericsning@gmail.com> 2015
    Grzegorz Krason <grzegorz.krason@gmail.com> 2015
    Jiří Suchan <yed@vanyli.net> 2015
    Kazunari Kobayashi <kobanari@nifty.com> 2015
    Kevin Bullock <kbullock@ringworld.org> 2015
    kobanari <kobanari@nifty.com> 2015
    Marc Abramowitz <marc@marc-abramowitz.com> 2015
    Marc Villetard <marc.villetard@gmail.com> 2015
    Matthias Zilk <matthias.zilk@gmail.com> 2015
    Michael Pohl <michael@mipapo.de> 2015
    Michael V. DePalatis <mike@depalatis.net> 2015
    Morten Skaaning <mortens@unity3d.com> 2015
    Nick High <nick@silverchip.org> 2015
    Niemand Jedermann <predatorix@web.de> 2015
    Peter Vitt <petervitt@web.de> 2015
    Ronny Pfannschmidt <opensource@ronnypfannschmidt.de> 2015
    Sam Jaques <sam.jaques@me.com> 2015
    Tuux <tuxa@galaxie.eu.org> 2015
    Viktar Palstsiuk <vipals@gmail.com> 2015
    Ante Ilic <ante@unity3d.com> 2014
    Bradley M. Kuhn <bkuhn@sfconservancy.org> 2014
    Calinou <calinou@opmbx.org> 2014
    Daniel Anderson <daniel@dattrix.com> 2014
    Henrik Stuart <hg@hstuart.dk> 2014
    Ingo von Borstel <kallithea@planetmaker.de> 2014
    Jelmer Vernooij <jelmer@samba.org> 2014
    Jim Hague <jim.hague@acm.org> 2014
    Matt Fellows <kallithea@matt-fellows.me.uk> 2014
    Max Roman <max@choloclos.se> 2014
    Na'Tosha Bard <natosha@unity3d.com> 2014
    Rasmus Selsmark <rasmuss@unity3d.com> 2014
    Tim Freund <tim@freunds.net> 2014
    Travis Burtrum <android@moparisthebest.com> 2014
    Zoltan Gyarmati <mr.zoltan.gyarmati@gmail.com> 2014
    Marcin Kuźmiński <marcin@python-works.com> 2010-2013
    xpol <xpolife@gmail.com> 2012-2013
    Aparkar <aparkar@icloud.com> 2013
    Dennis Brakhane <brakhane@googlemail.com> 2013
    Grzegorz Rożniecki <xaerxess@gmail.com> 2013
    Jonathan Sternberg <jonathansternberg@gmail.com> 2013
    Leonardo Carneiro <leonardo@unity3d.com> 2013
    Magnus Ericmats <magnus.ericmats@gmail.com> 2013
    Martin Vium <martinv@unity3d.com> 2013
    Simon Lopez <simon.lopez@slopez.org> 2013
    Ton Plomp <tcplomp@gmail.com> 2013
    Augusto Herrmann <augusto.herrmann@planejamento.gov.br> 2011-2012
    Dan Sheridan <djs@adelard.com> 2012
    Dies Koper <diesk@fast.au.fujitsu.com> 2012
    Erwin Kroon <e.kroon@smartmetersolutions.nl> 2012
    H Waldo G <gwaldo@gmail.com> 2012
    hppj <hppj@postmage.biz> 2012
    Indra Talip <indra.talip@gmail.com> 2012
    mikespook 2012
    nansenat16 <nansenat16@null.tw> 2012
    Philip Jameson <philip.j@hostdime.com> 2012
    Raoul Thill <raoul.thill@gmail.com> 2012
    Stefan Engel <mail@engel-stefan.de> 2012
    Tony Bussieres <t.bussieres@gmail.com> 2012
    Vincent Caron <vcaron@bearstech.com> 2012
    Vincent Duvert <vincent@duvert.net> 2012
    Vladislav Poluhin <nuklea@gmail.com> 2012
    Zachary Auclair <zach101@gmail.com> 2012
    Ankit Solanki <ankit.solanki@gmail.com> 2011
    Dmitri Kuznetsov 2011
    Jared Bunting <jared.bunting@peachjean.com> 2011
    Jason Harris <jason@jasonfharris.com> 2011
    Les Peabody <lpeabody@gmail.com> 2011
    Liad Shani <liadff@gmail.com> 2011
    Lorenzo M. Catucci <lorenzo@sancho.ccd.uniroma2.it> 2011
    Matt Zuba <matt.zuba@goodwillaz.org> 2011
    Nicolas VINOT <aeris@imirhil.fr> 2011
    Shawn K. O'Shea <shawn@eth0.net> 2011
    Thayne Harbaugh <thayne@fusionio.com> 2011
    Łukasz Balcerzak <lukaszbalcerzak@gmail.com> 2010
    Andrew Kesterson <andrew@aklabs.net>
    cejones
    David A. Sjøen <david.sjoen@westcon.no>
    James Rhodes <jrhodes@redpointsoftware.com.au>
    Jonas Oberschweiber <jonas.oberschweiber@d-velop.de>
    larikale
    RhodeCode GmbH
    Sebastian Kreutzberger <sebastian@rhodecode.com>
    Steve Romanow <slestak989@gmail.com>
    SteveCohen
    Thomas <thomas@rhodecode.com>
    Thomas Waldmann <tw-public@gmx.de>