Zend Form revisited

Posted in development on January 3rd, 2009 by admin

Some time ago I introduced declarative creation of Zend Form instances. But what should you do if you have two or more forms that share common parts? Yes, you can copy-paste form files, but adding an extension mechanism is much easier and much more elegant :)

You will need only a couple of changes to your code:

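// Recursively collapses the numeric lists produced by array_merge_recursive()
// to their last element, so that the child's value wins over the parent's.
function exclude_duplicates($var) {

    if( is_array($var) && count($var) > 0 ) {
        $cnt = count($var) - 1;
        // A plain 0..n list: keep only the last (most recently merged) value
        if( array_values($var) === $var ) {
            return exclude_duplicates( $var[$cnt] );
        }

        // An associative array: process every value, keep the keys
        $keys   = array_keys($var);
        $values = array_map( 'exclude_duplicates', array_values($var));
        return array_combine($keys, $values);
    }

    return $var;
}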

class KB_Form extends Zend_Form {

    public function setOptions(array $options) {
        $result = array();

        if( isset($options['extends']) ) {
            $form_name = $options['extends'];

            // Load the parent form's config and merge the child's options over it
            $parent_options = KB_Form_Factory::getInstance()->getFormConfig($form_name);
            if( $parent_options == null ) {
                throw new KB_Exception('Cannot obtain form '.$form_name.' config.', KB_Exception::KB_FORM);
            }
            $result = array_merge_recursive($parent_options->toArray(), $options);
            // Collapse the (parent, child) pairs produced by the merge to the child's value
            $result = array_combine(array_keys($result), array_map( 'exclude_duplicates', array_values($result) ) );
        } else {
            $result = $options;
        }

        if (isset($result['bo_class'])) {
            // ...
        }

        // ...
        return $this;
    }

    public function addElement($element, $name = null, $options = null) {
        parent::addElement( $element, $name, $options );

        if( $options !== null && isset($options['bo_method'])) {
            // An element instance carries its own name
            if( $element instanceof Zend_Form_Element ) {
                $name = $element->getName();
            }
            if( ! isset($this->_bo_methods[$name]) ) {
                $this->_bo_methods[$name] = $options['bo_method'];
            }
        }

        return $this;
    }
}

And now, if you have a form named ‘Search_Index’, you can derive your new form from it by adding one simple line:

; general form meta information
search.kanji.extends = "Search_Index"

You can override properties of the parent form by simply adding them to your new file:

; override order
search.kanji.elements.submit.options.order = 6
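
To see why the override works, here is a minimal standalone sketch of the merge step performed in setOptions. The two arrays are made-up stand-ins for the parsed parent and child configs (the keys and values are assumptions, not taken from the real form files), and exclude_duplicates is repeated so the snippet runs on its own:

```php
<?php
// Same logic as exclude_duplicates() above, repeated to keep the sketch self-contained
function exclude_duplicates($var) {
    if (is_array($var) && count($var) > 0) {
        if (array_values($var) === $var) {
            // Numeric list: keep only the last value
            return exclude_duplicates($var[count($var) - 1]);
        }
        return array_combine(array_keys($var), array_map('exclude_duplicates', array_values($var)));
    }
    return $var;
}

// Hypothetical parent form config
$parent = array(
    'action'   => '/search',
    'elements' => array(
        'submit' => array('type' => 'submit', 'options' => array('order' => 5, 'label' => 'Search')),
    ),
);

// Hypothetical child form config: only the overridden property is listed
$child = array(
    'extends'  => 'Search_Index',
    'elements' => array(
        'submit' => array('options' => array('order' => 6)),
    ),
);

// array_merge_recursive() turns the clashing 'order' values into array(5, 6)
$merged = array_merge_recursive($parent, $child);

// exclude_duplicates() collapses array(5, 6) back to the child's value, 6
$result = array_combine(array_keys($merged), array_map('exclude_duplicates', array_values($merged)));

var_dump($result['elements']['submit']['options']);
```

One consequence of this approach is worth keeping in mind: any option whose value is a plain numeric list is also collapsed to its last item, because the function cannot tell a merged pair from an intentional list.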

That’s it for today…


Porter2 and regexp kung-fu

Posted in development on December 23rd, 2008 by admin

To improve search quality I needed a stemming algorithm. Porter2 seemed to be the best choice. However, I realized that the only reference implementation that exists is written in Snowball.

Now I’ll be throwing stones at Snowball. I really cannot understand the people who handcrafted this language. Its unreadability can be compared to Perl's, but its syntax and expressive power are really limited.

Can you tell me for sure what this piece of code is doing?

[substring] among (
    'eed' 'eedly'
        (R1 <-'ee')
    'ed' 'edly' 'ing' 'ingly'
        (
            test gopast v  delete
            test substring among(
                'at' 'bl' 'iz'
                    (<+ 'e')
                'bb' 'dd' 'ff' 'gg' 'mm' 'nn' 'pp' 'rr' 'tt'
                    ([next]  delete)
                ''  (atmark p1  test shortv  <+ 'e')
            )
        )
)

So I finally sat down and implemented a PHP5 version of this algorithm.

Read more »


Zend Forms

Posted in development on December 4th, 2008 by admin

It’s been a while since my last Zend-related post. Now I’ll try to cover an approach to Zend_Form usage. This component has really extensive functionality. You can use Zend_Form in a straightforward manner by creating form elements in your controller’s code:

protected function prepareForm() {
    $form = new Zend_Form();

    $query = new Zend_Form_Element_Text('query');
    $form->addElement($query);

    // ...

    $submit = new Zend_Form_Element_Submit('submit');
    $form->addElement($submit);

    return $form;
}

Read more »


Zend & Smarty – Step Two

Posted in development on October 13th, 2008 by admin

In this post I will reveal the secret of really smart integration between Zend and Smarty :) Since I’m too lazy to write 10 similar functions inside the Smarty plugin, I decided to modify the Smarty compiler. And it worked well.

Making Smarty zend-aware

This step is simple. You just add a method that lets you call a Zend View helper by name:

class KB_SmartyZendAware extends Smarty {
    private $_zendView                 = null;
    private $_cfg;

    public function __construct() {
        $this->compiler_class = 'KB_SmartyZendAwareCompiler';
    }

    public function setZendView(Zend_View_Abstract $view) {
        if( $view === null ) {
            throw new KB_Exception('Zend_View cannot be null.', KB_Exception::KB_SMARTY);
        }
        $this->_zendView = $view;
    }

    public function callZendViewHelper( $name, $method, $args ) {
        if( $this->_zendView === null || ! is_string($name) || strlen($name) == 0 ) {
            return '';
        }
        $helper = $this->_zendView->getHelper($name);

        if( ! is_string($method) || strlen($method) == 0 ) {
            // No method given: call the helper itself with the arguments
            return call_user_func_array( array( $helper, $name ), $args);
        } else {
            // Method given: call the helper without arguments, then the method on its result
            return call_user_func_array( array( call_user_func( array($helper, $name) ), $method ), $args);
        }
    }
}
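
The two branches of callZendViewHelper can be tried without Zend or Smarty installed. The sketch below is only an illustration: StubHeadScript is a hypothetical stand-in for a helper such as headScript, and call_helper is a free-function copy of the dispatch logic, not part of the real class.

```php
<?php
// Hypothetical stand-in for a Zend view helper whose main method returns the helper itself
class StubHeadScript {
    private $files = array();

    public function headScript() {
        return $this;
    }

    public function appendFile($src) {
        $this->files[] = $src;
        return $this;
    }

    public function getFiles() {
        return $this->files;
    }
}

// Same dispatch as in callZendViewHelper, minus the Zend_View lookup
function call_helper($helper, $name, $method, array $args) {
    if (!is_string($method) || strlen($method) == 0) {
        // Direct call: $helper->$name(...$args)
        return call_user_func_array(array($helper, $name), $args);
    }
    // Chained call: $helper->$name()->$method(...$args)
    $target = call_user_func(array($helper, $name));
    return call_user_func_array(array($target, $method), $args);
}

$helper = new StubHeadScript();

// Equivalent to $helper->headScript()->appendFile('/js/app.js')
call_helper($helper, 'headScript', 'appendFile', array('/js/app.js'));

// Equivalent to $helper->headScript()
$returned = call_helper($helper, 'headScript', '', array());

var_dump($helper->getFiles());
```

Passing an empty method name covers simple helpers, while a non-empty one covers helpers that expose further methods on the object they return.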

Read more »


Zend & Smarty – Step One

Posted in development on October 9th, 2008 by admin

It’s been a while since I updated the blog, but things have been pretty busy lately…
So finally the application’s skeleton is in place. It uses Zend + Smarty. I’m also done with
Tanaka corpus parsing. I can only say that APR is really easy to use.

I won’t explain in detail here how to integrate Zend View and Smarty. This is actually
pretty easy. You may refer to a fairly old post by Ralf Eggert, ‘Integrating Smarty with the Zend Framework‘, or read Quentin Zervaas’s ‘Practical Web 2.0 Applications with PHP‘.

Once you've nailed this, you’ll most likely start thinking about something more elaborate, capable of supporting Zend_View helpers such as headScript, doctype etc. And most probably you will end up with a plugin class that maps Zend_View helpers to Smarty custom functions.

My code looked like this:

Read more »


Saxon thrashes Altova…

Posted in development on September 22nd, 2008 by admin

This Sunday I was busy trying to optimize the data load process. In fact, I ended up completely rewriting the stylesheets. During this process I had a chance to compare the performance of the 2 XSLT processors I use: Altova XSLT and Saxon-B. The results were unexpected.

What you can find below is not a real benchmark. I simply took an average execution time calculated after 3 test runs on some of the stylesheets I use.

                                     Saxon-B [1] (compiled)   Saxon-B [2]   Saxon-SA   AltovaXSLT
Stylesheet #1 (input file: 45 MB)    11.2183                  11.156        11.484     108.031
Stylesheet #2 (input file: 1.5 MB)   3.296                    3.171         4.484      153.671
Stylesheet #3 (input file: 70 MB)    N/A                      77.453        N/A        ERR_OOM

[1] – A compiled stylesheet was used instead of raw XSLT. These results lead us to an interesting conclusion: the popular assumption that a product that compiles to bytecode will necessarily be faster than an interpreter is WRONG. I will try to cover this topic in my next posts.

[2] – The settings for all Saxon runs were as follows: -l:off -dtd:off -tree:tiny

[3] – All results are in seconds

Well, in some cases Saxon, which is pure Java, is up to 48 times faster than Altova, which is pure C++. Moreover, Altova consumes enormous amounts of memory and fails to process relatively small files (approximately 45–70 MB) on a 32-bit machine, while Saxon uses around 300 MB regardless of the input file size.
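
The "48 times" figure comes from Stylesheet #2 in the table above. A quick way to check the ratios, taking the raw (non-compiled) Saxon-B column as the baseline, which is simply my choice for this sketch:

```php
<?php
// Timings in seconds, copied from the table above
$timings = array(
    'Stylesheet #1' => array('saxon_b' => 11.156, 'altova' => 108.031),
    'Stylesheet #2' => array('saxon_b' => 3.171,  'altova' => 153.671),
);

$ratio = array();
foreach ($timings as $name => $t) {
    // How many times longer AltovaXSLT took than Saxon-B on the same stylesheet
    $ratio[$name] = $t['altova'] / $t['saxon_b'];
    printf("%s: AltovaXSLT took %.1f times longer\n", $name, $ratio[$name]);
}
```

This gives roughly 9.7 for Stylesheet #1 and 48.5 for Stylesheet #2. Stylesheet #3 is left out because Altova ran out of memory on it.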

So right now the dictionary is processed in 77 seconds and loaded into the DB in less than 5 seconds. Not bad I think…


Got data?

Posted in development on September 16th, 2008 by admin

Yesterday I finished the first version of a fancy XSLT 2.0 stylesheet that transforms JMDict into insert statements; today I tried to run it on a 70 MB file. The results are as follows:

  • AltovaXML – allocated 1.5 GB of RAM and died after 30 minutes.
  • MSXSL – crashed after 15 minutes.
  • Saxon SA – died immediately with a class cast exception.
  • Saxon-B – gobbled 300 MB of RAM and successfully finished processing in 218 minutes.

After 15 minutes of work on the template, Saxon processed it in about an hour. Still not good. The reason is the enormous number of cross-references, each of which requires iterating over the whole document to find an exact match.
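
To illustrate why such lookups are expensive, here is a small sketch that contrasts a full scan per reference with a lookup table built once up front. This is a general technique, not necessarily what the stylesheet ended up doing, and the entry structure is invented for the example (it is not the JMDict schema). In XSLT the analogous mechanism is xsl:key together with the key() function.

```php
<?php
// Hypothetical entries: each has an id and a list of cross-referenced ids
$entries = array(
    array('id' => 'e1', 'word' => 'kanji', 'xref' => array('e3')),
    array('id' => 'e2', 'word' => 'kana',  'xref' => array('e1', 'e3')),
    array('id' => 'e3', 'word' => 'moji',  'xref' => array()),
);

// Naive lookup: walks the whole list for every single reference
function find_by_scan(array $entries, $id) {
    foreach ($entries as $entry) {
        if ($entry['id'] === $id) {
            return $entry;
        }
    }
    return null;
}

// Indexed lookup: one pass to build the map, then a direct array access per reference
function build_index(array $entries) {
    $index = array();
    foreach ($entries as $entry) {
        $index[$entry['id']] = $entry;
    }
    return $index;
}

$index    = build_index($entries);
$resolved = array();
foreach ($entries as $entry) {
    foreach ($entry['xref'] as $ref) {
        $resolved[] = $entry['word'] . ' -> ' . $index[$ref]['word'];
    }
}

print_r($resolved);
```

With n entries and m references, the scan costs on the order of n × m comparisons, while the index costs one pass over the entries plus one lookup per reference.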

By the time I got there, I had abandoned the idea of loading the data using inserts. We’ll use bulk load instead, so another optimization round is expected tomorrow. However, out of curiosity I tried to import the ‘insert-like’ data. It took approximately 6 hours :)