Code snippets for symfony 1.x


Snippets tagged "search"

Adding stemming to Doctrine Searchable

Stemming reduces a word to its common stem, so if you search on Google for "Knit", you'll also be searching for "Knits" and "Knitting". Sadly, Doctrine's Searchable behaviour doesn't support stemming out of the box, but it's pretty easy to add.

I've chosen to use the PECL 'stem' extension in this implementation because it contains the excellent Porter2 stemmer. You may be able to find a free pure PHP implementation of the algorithm somewhere, or decide that the original Porter stemmer is good enough. I couldn't.

You'll need to compile and install the PECL stem extension, and make sure it's loaded.
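A quick way to confirm the extension is actually available before going any further. This is a minimal sketch, assuming the extension registers itself under the name `stem` and was built with the English (Porter2) stemmer, so that `stem_english()` exists. The helper name `stemmer_available()` is made up for this example.

```php
<?php
// Hypothetical helper: true only if the PECL stem extension is loaded
// and the English stemmer was compiled in.
function stemmer_available()
{
    return extension_loaded('stem') && function_exists('stem_english');
}

if (stemmer_available()) {
    // Show what the stemmer does with a few related words.
    foreach (array('knit', 'knits', 'knitting') as $word) {
        echo $word, ' => ', stem_english($word), "\n";
    }
} else {
    echo "The stem extension (with English support) is not loaded.\n";
}
```

If the second message appears, check that the extension is enabled in the php.ini used by both the command line and the web server, since the two often differ.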

The stemming analyzer class

Create lib/MyAnalyzer.class.php:

    class MyAnalyzer extends Doctrine_Search_Analyzer_Standard
    {
      public function analyze($text)
      {
        // First run the standard analyzer. This will do a lot of tidying on
        // the text and remove stopwords, so we only need to do the stemming
        $text = parent::analyze($text);
        foreach ($text as &$keyword) {
          $keyword = stem_english($keyword);
        }
        // Break the reference left behind by the foreach loop
        unset($keyword);
        return $text;
      }
    }
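The stemming step can also be tried on its own, without Doctrine, which is handy for a quick unit test. The sketch below assumes the standard analyzer hands back an array of keywords. `stem_keywords()` and `toy_stemmer()` are hypothetical names, and the toy stemmer only strips a trailing "s" so that the example still runs when the PECL extension is missing.

```php
<?php
// Hypothetical helper: apply a stemmer callback to every keyword.
// array_map() with a single array keeps the original keys intact.
function stem_keywords(array $keywords, $stemmer)
{
    return array_map($stemmer, $keywords);
}

// Toy stand-in for stem_english(): strips one trailing "s".
// This is NOT a real stemmer, just enough to make the example runnable.
function toy_stemmer($word)
{
    return preg_replace('/s$/', '', $word);
}

// Use the real stemmer when the extension is present.
$stemmer = function_exists('stem_english') ? 'stem_english' : 'toy_stemmer';

print_r(stem_keywords(array(0 => 'knits', 2 => 'sweaters'), $stemmer));
```

Passing the stemmer in as a callback keeps the loop independent of the extension, so a different stemmer could be swapped in later.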

Making your project use it

First, create a Doctrine listener, lib/MyConnectionListener.class.php:

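    class MyConnectionListener extends Doctrine_EventListener
    {
      public function postConnect(Doctrine_Event $e)
      {
        // The analyzer option lives on the search plugin behind the
        // Searchable template, so fetch that and swap the analyzer in
        $entity_table = Doctrine_Core::getTable('Entity');
        $entity_table->getTemplate('Doctrine_Template_Searchable')
                     ->getPlugin()
                     ->setOption('analyzer', new MyAnalyzer);
      }
    }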

Then edit your ProjectConfiguration class:

    public function configureDoctrineConnection(Doctrine_Connection $conn)
    {
      $conn->addListener(new MyConnectionListener());
    }

Finally, in your frontend code, make sure you stem your search terms before running a query.

    $terms = 'my search string';
    $stemmer = new MyAnalyzer();
    // analyze() returns an array of stemmed keywords; rejoin them into one string
    $terms = implode(' ', $stemmer->analyze($terms));
    $results = Doctrine_Core::getTable('Entity')->search($terms);
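What comes back from search() is not a collection of records. As a rough sketch, assuming each row is an array with 'id' and 'relevance' keys (which is how the Doctrine 1.x Searchable documentation describes it), the ids can be pulled out and used to fetch the records. `search_result_ids()` is a hypothetical helper, and the sample data is invented for illustration.

```php
<?php
// Hypothetical helper: pull the primary keys out of search results,
// assuming each row is an array with 'id' and 'relevance' keys.
function search_result_ids(array $results)
{
    $ids = array();
    foreach ($results as $row) {
        $ids[] = $row['id'];
    }
    return $ids;
}

// Invented sample data shaped like the assumed return value of search().
$results = array(
    array('id' => 7, 'relevance' => 2),
    array('id' => 3, 'relevance' => 1),
);
$ids = search_result_ids($results);

// With Doctrine loaded, the records could then be fetched along the lines of:
// Doctrine_Query::create()->from('Entity e')->whereIn('e.id', $ids)->execute();
```

Note that a whereIn() query will not necessarily return rows in relevance order, so you may need to re-sort the records against `$ids` afterwards.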

A couple of notes:

by Matt Robinson on 2009-11-14, tagged doctrine, search, searchable, stemming