A reasonable tutorial exists at http://www.w3schools.com.
Explore Zend issues here.
These are the guts of PHP: how to extend it, for example by adding bindings to the VAS API set. Here are some notes as we go...
All PHP bindings follow a common structure:
There are two ways of extending via an external shared-object module (such as libphp_vas.so.4.3.0). In this discussion, the external module is php_vas.so, a link placed in /usr/lib/php/extensions that points to libphp_vas.so.4.3.0. (For now, I haven’t been tempted to rename our shared object to php_vas.so, since it’s more explicit to link to a file with such a descriptive name; that way, you know what you have got, version-wise and otherwise. A word to the wise sufficeth: VAS changes versions and so does PHP.) In our example here, we posit our VAS bindings as for php5, which we’re working on getting running at the moment (php4 is already up and going). The first way is to load the module from the PHP script itself:
if( ! extension_loaded( "php_vas5.so" ) )
{
    if( ! dl( "php_vas5.so" ) )
        return;
}
This has the theoretical disadvantage, however, of incurring a potentially expensive load each time a PHP script that needs the module is run. The other way is to annotate php.ini using the extension directive.
Our existing bindings failed to compile with php5 for two reasons: a) the zend_function_entry structure itself is radically different between the two versions, with a substructure of some complexity, and b) the arguments to the handler (wrapper) are also more numerous in php5:
php4:

typedef struct
{
    char *fname;
    void (*handler)( int ht,                    /* (glue code) */
                     zval *return_value,
                     zval *this_ptr,
                     int return_value_used,
                     void ***tsrm_ls );
    unsigned char *func_arg_types;
} zend_function_entry;

php5:

typedef struct
{
    char *name;
    zend_uint name_len;
    char *class_name;
    zend_uint class_name_len;
    zend_bool array_type_hint;
    zend_bool allow_null;
    zend_bool pass_by_reference;
    zend_bool return_reference;
    int required_num_args;
} zend_arg_info;

typedef struct
{
    char *fname;
    void (*handler)( int ht,                    /* (glue code) */
                     zval *return_value,
                     zval **return_value_ptr,
                     zval *this_ptr,
                     int return_value_used,
                     void ***tsrm_ls );
    zend_arg_info *arg_info;
    zend_uint num_args;
    zend_uint flags;
} zend_function_entry;
This discovered, I will examine the very binding code to see what must be done to accommodate php5 and, further, to what extent the result may still be single-sourced.
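One way the glue might stay single-sourced is to hide the differing handler parameter list behind a macro chosen at compile time. The sketch below is only an illustration of that idea: DEMO_PHP_MAJOR, demo_zval and every DEMO_ or demo_ name are hypothetical stand-ins, not Zend or VAS identifiers, and the real parameter lists also carry the thread-safety argument.

```c
#include <stddef.h>

/* Assumption: the build system selects the target PHP major version. */
#ifndef DEMO_PHP_MAJOR
#define DEMO_PHP_MAJOR 5
#endif

/* Hypothetical stand-in for zval. */
typedef struct { int type; } demo_zval;

#if DEMO_PHP_MAJOR >= 5
/* php5-style list: one extra argument in the middle. */
#define DEMO_HANDLER_PARAMS \
    int ht, demo_zval *return_value, demo_zval **return_value_ptr, \
    demo_zval *this_ptr, int return_value_used
#define DEMO_HANDLER_ARG_COUNT 5
#else
/* php4-style list. */
#define DEMO_HANDLER_PARAMS \
    int ht, demo_zval *return_value, \
    demo_zval *this_ptr, int return_value_used
#define DEMO_HANDLER_ARG_COUNT 4
#endif

/* Glue functions are written once against the macro. */
#define DEMO_NAMED_FUNC(name) void demo_zif_##name(DEMO_HANDLER_PARAMS)

static DEMO_NAMED_FUNC(noop)
{
    (void)ht;
    (void)this_ptr;
    (void)return_value_used;
#if DEMO_PHP_MAJOR >= 5
    (void)return_value_ptr;
#endif
    return_value->type = 0;   /* "null" in this sketch */
}
```

With the default of 5, the function is called as demo_zif_noop( 0, &rv, NULL, NULL, 0 ); building with DEMO_PHP_MAJOR set to 4 drops the third argument without touching the function body.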
Link to sample extension code: http://devzone.zend.com/manual/view/page/zend.creating.html.
Link to Zend API: Hacking the Core of PHP: http://devzone.zend.com/manual/view/page/zend.html.
Every user-visible function must have an entry in an array. This introduces them to Zend by name as it should appear in PHP (the expected, VAS API name) and the underlying, implementation name (which, for now, is the same but prefixed by zif_—PHP really wants this to be prefixed with zif_ and we should change this later). This information comes from http://devzone.zend.com/manual/view/page/zend.structure.html. This document covers only php4 and there are differences introduced in php5.
void zif_function-name( int ht,
                        zval *return_value,
                      [ zval **return_value_ptr, ]   /* new in php5 */
                        zval *this_ptr,
                        int return_value_used );
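The idea of an array that introduces each function by name can be sketched without any Zend headers. Everything below is a hypothetical stand-in (demo_zval, demo_function_entry, demo_find); the real zend_function_entry has more fields, but it is likewise an array of name/handler rows closed by an all-NULL row.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, much-reduced stand-in for zval. */
typedef struct { int type; long lval; } demo_zval;

typedef void (*demo_handler)(int ht, demo_zval *return_value);

typedef struct {
    const char  *fname;     /* name as the script sees it */
    demo_handler handler;   /* underlying C implementation */
} demo_function_entry;

static void demo_zif_answer(int ht, demo_zval *return_value)
{
    (void)ht;
    return_value->type = 1;
    return_value->lval = 42;
}

/* The table is closed by an all-NULL sentinel row. */
static const demo_function_entry demo_functions[] = {
    { "answer", demo_zif_answer },
    { NULL, NULL }
};

/* Looks a function up by its script-visible name; NULL if absent. */
static demo_handler demo_find(const demo_function_entry *table,
                              const char *name)
{
    for (; table->fname != NULL; table++) {
        if (strcmp(table->fname, name) == 0)
            return table->handler;
    }
    return NULL;
}
```

A caller would fetch the handler with demo_find( demo_functions, "answer" ) and invoke it with an argument count and a place to put the return value.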
Each function needs a definition and, for php5, a filled-in arg_info structure. In each of these functions, there are 5 or 6 arguments (depending on whether this is built for php4 or php5). Most are accessed only through special macros and some are only seen in this code after preprocessing the macros out. Except as marked, each of these is in both php4 and php5. They are:
Argument | Description
ht | Number of arguments passed to Zend, obtained only using ZEND_NUM_ARGS().
return_value | Used to pass return values from this function back to PHP using predefined macros.
return_value_ptr | (php5) Don’t know (at this point) what this does. It is this argument, smack in the middle of the list, that caused these bindings to fail to compile originally for php5.
(Returning values is discussed in: http://devzone.zend.com/manual/view/page/zend.returning.html.)
this_ptr | Gains access to the object in which the function is contained, if used within an object. The VAS APIs aren’t object-oriented, so this isn’t specially used, but some macros in use actually touch it. Apparently, function getThis can be called to get this pointer.
return_value_used | Flag indicating whether the return value of this function will be consumed by the calling PHP code: 0 if it won’t be used, 1 if it will.
executor_globals | Points to global settings of the Zend engine, used only if creating new variables (which VAS doesn’t do). Because Red Hat platform implementations of PHP didn’t support this argument at the time of the original implementation, it doesn’t appear in INTERNAL_FUNCTION_PARAMETERS (defined in zend.h).
Comparing the preprocessed code between php4 and php5 to see the differences in the zend_ interfaces, etc., it appears they have not changed since php4. This means that, for the most part, the conversion to php5 will not entail much real work.
Impediments include:
Lots of stringization (#) and glue macros (##) are used in the formation of the macros consumed in this code both in Zend and in the VAS bindings code.
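For orientation, here is a minimal sketch of the two preprocessor operators at work. The DEMO_ macros are hypothetical and only imitate the general pattern of building a C symbol such as zif_name from a bare name; they are not copied from Zend or from the VAS bindings.

```c
/* Stringization: turns the token x into the string literal "x". */
#define DEMO_STR(x)            #x

/* Token pasting: joins two tokens into a single identifier. */
#define DEMO_GLUE(a, b)        a##b

/* Declares int demo_zif_<name>(void) from a bare name. */
#define DEMO_NAMED_FUNC(name)  int DEMO_GLUE(demo_zif_, name)(void)

/* Adjacent string literals are concatenated by the compiler, so this
 * yields "demo_zif_<name>" as one string. */
#define DEMO_SYMBOL_NAME(name) "demo_zif_" DEMO_STR(name)

/* Expands to: int demo_zif_user_init(void) { return 4; } */
DEMO_NAMED_FUNC(user_init) { return 4; }
```

Here DEMO_STR( user_init ) gives the script-visible name as a string, while DEMO_NAMED_FUNC( user_init ) gives the prefixed C function, which is why the same bare name can feed both halves of a function table entry.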
Macros prefixed with ZEND_ belong to the Zend (PHP) implementation, those prefixed with PHP_ to the PHP implementation, and those beginning with SPE_ to the VAS (Kerns) implementation.
Somehow, zm_startup_vas (generated by the macro PHP_MINIT_FUNCTION) is called. It runs through some initializations, mostly for complex types, registering destructor (object clean-up) functions using the macro SPE_REGISTER_DTOR.
The converse of the initialization function, zm_shutdown_vas (generated by the macro PHP_MSHUTDOWN_FUNCTION), happens to do nothing in the case of VAS.
An example is an object of type vas_user_info_t, which must be cleaned up using vas_user_info_free. The corresponding VAS API call is made through vas_user_t_free when the destructor, php_vas_user_t_dtor, is called.
Note: SPEPRINTF is called with a notice that this is happening. How to turn this on for debugging purposes? Define SPE_DEBUG at the top of vasapi.c.
static int   le_vas_user_t;
static char *PHP_vas_user_t_RES_NAME = "vas_user_t";

// initialization...
PHP_MINIT_FUNCTION:
zm_startup_vas( ... )
{
    .
    .
    .
SPE_REGISTER_DTOR:
    le_vas_user_t = zend_register_list_destructors_ex( php_vas_user_t_dtor, NULL,
                        PHP_vas_user_t_RES_NAME, module_number );
    .
    .
    .
    php_vas_init_globals( &vas_globals );
}

// usage (consumption in PHP code)...
[vas_user_t *user;]
$user = vas_user_init( $ctx, $id, "administrator", 0 );

// destruction (happens when object no longer needed)...
typedef struct
{
    SPE_vas_ctx_t *ctx;
    vas_user_t    *raw;      // (apparently refers to VAS' allocated object)
    unsigned char  noFree;
} SPE_vas_user_t;

php_vas_user_t_dtor( zend_rsrc_list_entry *rsrc )
{
    SPE_vas_user_t *thing = rsrc->ptr;

    SPEPRINTF( "<addr>: Calling vas_user_t_dtor, noFree=N\n" );

    if( thing && thing->raw )
    {
        vas_user_t_free( thing->ctx->ctx, thing->raw );
        SPE_vas_ctx_t_free( thing->ctx );   // (destructor for vas_ctx_t)
        _efree( thing );                    // (free rest of vas_user_t)
    }
}
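The register-then-destroy cycle above can be imitated in a few lines of plain C. This is only a sketch of the mechanism as I understand it, with hypothetical names throughout (demo_register_dtor, demo_release, demo_user); Zend's own resource list is considerably more involved.

```c
#include <stdlib.h>

#define DEMO_MAX_TYPES 8

typedef void (*demo_dtor)(void *ptr);

static demo_dtor demo_dtors[DEMO_MAX_TYPES];
static int       demo_type_count;

/* Registers a destructor and returns its type id, or -1 when full.
 * The id plays the role that le_vas_user_t plays in the notes. */
static int demo_register_dtor(demo_dtor dtor)
{
    if (demo_type_count >= DEMO_MAX_TYPES)
        return -1;
    demo_dtors[demo_type_count] = dtor;
    return demo_type_count++;
}

/* A registered resource: the type id plus the wrapped pointer. */
typedef struct { int type; void *ptr; } demo_resource;

/* What the "engine" does when the script is finished with a resource. */
static void demo_release(demo_resource *res)
{
    if (res->ptr != NULL && res->type >= 0 && res->type < demo_type_count) {
        demo_dtors[res->type](res->ptr);
        res->ptr = NULL;              /* guards against a second release */
    }
}

/* Wrapper around a "raw" library object. */
typedef struct { int *raw; } demo_user;

static int demo_user_frees;           /* counts destructor runs */

static void demo_user_dtor(void *ptr)
{
    demo_user *thing = ptr;
    if (thing != NULL) {
        free(thing->raw);             /* the library's object */
        free(thing);                  /* the wrapper itself */
        demo_user_frees++;
    }
}
```

The glue code only ever stores the type id; it never calls the destructor itself, which is why an object must be freed in exactly one place.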
The glue (or binding code) for a C library/shared object is written in C, of course, using macros from Zend that call functions prefixed with zend_ and do Zend things, as well as specially created ones in the VAS (Kerns) implementation, usually prefixed with SPE_.
Zend macros are generally from -I Zend (Zend/zend_API.h or Zend/zend_list.h). The other macros are at the top of vasapi.c or php_vas.h.
// glue code function definition...
typedef struct
{
    char *fname;
    void (*handler)( int ht, zval *return_value, zval *this_ptr, int return_value_used );
    unsigned char *func_arg_types;
} zend_function_entry;

// The glue-code function definition is (see Wrapped implementations above):
void zif_function-name( int ht, zval *return_value, zval *this_ptr, int return_value_used );

ZEND_VAS_NAMED_FUNC( function-name )
{
    auto-class variables...
    ...including zval copies of some pointers (objects)

    SPE_CHECK_ARGS( argument-count );
    zend_parse_parameters( argument-count, ... )
    ZEND_FETCH_RESOURCE( which ... );

    err = vas API call...
    SPE_SET_VAS_ERR( err );

    Check for errors, if none:
        SPE_CONS_RETURN_VALUE() or RETVAL_STRING(), etc.
    otherwise:
        RETURN_NULL();
}
ZEND_VAS_NAMED_FUNC generates the function definition/header, including the Zend arguments.
There are no examples of arguments that, like doubles, are more than standard width, so it’s not clear at this writing whether this is a significant check.
SPE_CHECK_ARGS sets an argument base (to zero), which it bumps by one if this_ptr is non-nil and an object.
The argument count passed to the macro must equal ht plus the argument base. (Explain this.) If that is not the case, zend_wrong_param_count is called and the glue function returns.
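Reduced to plain C, the check described here amounts to the following. This is a sketch based on the preprocessed output further down, not the macro itself; demo_check_args and its parameters are hypothetical names.

```c
/* Returns 1 when the argument count is acceptable and 0 when the glue
 * function should report a wrong parameter count and return.
 * called_on_object stands for "this_ptr is non-nil and an object". */
static int demo_check_args(int ht, int called_on_object, int expected)
{
    int argbase = 0;

    if (called_on_object)
        argbase++;

    return ht + argbase == expected;
}
```

For a plain function call expecting four arguments, demo_check_args( 4, 0, 4 ) passes and demo_check_args( 3, 0, 4 ) fails.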
zend_parse_parameters parses out the “real” function arguments from the PHP script invocation. The first string argument ("rzsl") appears to be a type specification, as yet not understood.
Also, zval versions of the arguments, at least for passed in objects, are got for use in other situations. It appears that to pass deeper a vas_ctx_t or vas_id_t, you must use these.
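As far as I can tell from the PHP 4/5 extension documentation, each letter of the type specification names the type of one script argument: r for a resource, z for an arbitrary zval, s for a string and l for a long. A string yields two outputs (the characters and the length), which would explain why the four letters of "rzsl" are followed by five pointers. The helpers below are hypothetical and only encode that reading; they cover just these four letters.

```c
#include <stddef.h>

/* Describes one type-spec letter, or returns NULL for letters this
 * sketch does not handle. */
static const char *demo_spec_meaning(char c)
{
    switch (c) {
    case 'r': return "resource, received as a zval pointer";
    case 'z': return "any value, received as a zval pointer";
    case 's': return "string, received as char pointer plus length";
    case 'l': return "integer, received as a long";
    default:  return NULL;
    }
}

/* Counts the output pointers a caller must pass after the spec string,
 * or returns -1 if the spec contains an unhandled letter. */
static int demo_count_out_params(const char *spec)
{
    int n = 0;

    for (; *spec != '\0'; spec++) {
        if (demo_spec_meaning(*spec) == NULL)
            return -1;
        n += (*spec == 's') ? 2 : 1;   /* "s" needs value and length */
    }
    return n;
}
```

Under this reading the l slot expects a pointer to long, so a variable declared as int and passed for it may deserve a second look on platforms where long is wider than int.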
ZEND_FETCH_RESOURCE is used to fill out/get an actual VAS object like vas_ctx_t from a zval version thereof. However, the resulting object is still wrapped in a PHP wrapper. For each type of these, there is a corresponding structure definition, but the VAS animal in question is referred to by a field bearing its name (ctx, id, etc.).
Zend macro.
ZVAL_IS_NULL validates that one of these zval animals isn’t nil.
SPE_SET_VAS_ERR sets the global VAS error value to the return, almost always of the actual VAS function, so that PHP can report it back.
SPE_CONS_RETURN_VALUE handles the case where an object was created by the underlying VAS API: a pointer to it gets recorded as a raw data object in a wrapped-up object with a special type, the one spoken of in the paragraph on ZEND_FETCH_RESOURCE.
When it has created the object, this macro (or, rather, another it calls) uses ZEND_REGISTER_RESOURCE to register the resource, which is pretty much just a call to zend_register_resource.
Zend macro.
This macro is used in place of SPE_CONS_RETURN_VALUE when the object being returned is merely a string (that will later be cleaned by a call to free). This macro makes a call to ZVAL_STRING which duplicates the string object, recording it in a zval structure for strings. The calling code, in this case the glue code in vas_user_get_sid, disposes of the string passed back from the VAS API as soon as RETVAL_STRING is finished making its own copy, a decided inefficiency.
Zend macro.
Otherwise, there’s an error: RETURN_NULL sets the return value type to null (0) and returns at once. The VAS error has already been registered by SPE_SET_VAS_ERR for communication to the PHP script above, which calls vas_err_get_code as in:
if( vas_err_get_code( $ctx ) != VAS_ERR_SUCCESS )
{
    print_php_vas_error( $ctx );
    exit( 1 );
}
...so it’s not communicated quite the way you might expect. Note importantly that these bindings return (in PHP) complex objects as their “return value”, rather than errors. See sample PHP code from RC php-vas project (or look for and click on “php-vas” at RC website).
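The pattern of returning the object (or nothing) and leaving the error code in a side channel looks roughly like this in plain C. The names are hypothetical; in the bindings the side channel is the VAS error global set by SPE_SET_VAS_ERR and read back from PHP with vas_err_get_code.

```c
#include <stddef.h>

typedef enum { DEMO_ERR_SUCCESS = 0, DEMO_ERR_FAILURE = 1 } demo_err;

/* Side channel: the error from the most recent call. */
static demo_err demo_last_err;

/* Returns the "object" on success and NULL on failure; the reason for
 * a failure is never part of the return value. */
static const char *demo_lookup(int key)
{
    demo_last_err = (key >= 0) ? DEMO_ERR_SUCCESS : DEMO_ERR_FAILURE;
    return (key >= 0) ? "found" : NULL;
}

/* The caller asks for the error separately, after the call. */
static demo_err demo_err_get_code(void)
{
    return demo_last_err;
}
```

The cost of this design is that the error must be read before the next call overwrites it.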
First, the glue function as coded...
ZEND_VAS_NAMED_FUNC( vas_user_init )
{
    SPE_vas_ctx_t *ctx;
    SPE_vas_id_t  *id = NULL;
    vas_user_t    *user = NULL;
    vas_err_t      err;
    zval          *zctx;
    zval          *zId;
    int            flags;
    const char    *szName;
    int            lName;

    SPE_CHECK_ARGS( 4 );

    if( zend_parse_parameters( ZEND_NUM_ARGS() TSRMLS_CC, "rzsl", &zctx, &zId, &szName, &lName, &flags ) == FAILURE )
    {
        RETURN_NULL();
    }

    ZEND_FETCH_RESOURCE( ctx, SPE_vas_ctx_t*, &zctx, -1, PHP_vas_ctx_t_RES_NAME, le_vas_ctx_t );

    if( ! ZVAL_IS_NULL( zId ) )
    {
        ZEND_FETCH_RESOURCE( id, SPE_vas_id_t *, &zId, -1, PHP_vas_id_t_RES_NAME, le_vas_id_t );
    }

    err = vas_user_init( ctx->ctx, RAW( id ), szName, flags, &user );
    SPE_SET_VAS_ERR( err );

    if( err == VAS_ERR_SUCCESS )
    {
        SPE_CONS_RETURN_VALUE( vas_user_t, user );
    }
    else
    {
        RETURN_NULL();
    }
}
...then, the cleaned up preprocessor output of this function revealing the internals (with interleaved macros for orientation):
void zif_vas_user_init( int ht, zval *rval, zval *this_ptr, int rval_used )
{
    SPE_vas_ctx_t *ctx;
    SPE_vas_id_t  *id = NULL;
    vas_user_t    *user = NULL;
    vas_err_t      err;
    zval          *zctx;
    zval          *zId;
    int            flags;
    const char    *szName;
    int            lName;
    int            argbase = 0;

SPE_CHECK_ARGS( 4 ):
    if( this_ptr && this_ptr->type == 5 )
        argbase++;
    if( ht + argbase != 4 )
    {
RETURN_NULL():
        zend_wrong_param_count();
        return;
    }

    if( zend_parse_parameters( ht, "rzsl", &zctx, &zId, &szName, &lName, &flags ) == -1 )
    {
        rval->type = 0;
        return;
    }

ZEND_FETCH_RESOURCE( ctx, ... ):
    ctx = (SPE_vas_ctx_t *) zend_fetch_resource( &zctx, -1, "vas_ctx_t", NULL, 1, le_vas_ctx_t);
    if (!ctx)
    {
        rval->type = 0;
        return;
    }

ZVAL_IS_NULL( zId ):
    if( !( zId->type == 0 ) )
    {
ZEND_FETCH_RESOURCE( id, ... ):
        id = (SPE_vas_id_t *) zend_fetch_resource( &zId, -1, PHP_vas_id_t_RES_NAME, NULL, 1, le_vas_id_t );
        if (!id)
        {
            rval->type = 0;
            return;
        }
    }

    err = vas_user_init( ctx->ctx, ( id == NULL ) ? NULL : id->raw, szName, flags, &user );

SPE_SET_VAS_ERR( err ):
    vas_globals.g_vas_err = err;

    if( err == VAS_ERR_SUCCESS )
    {
SPE_CONS_RETURN_VALUE( vas_user_t, user ):
        SPE_vas_user_t *thing;
        thing = ( SPE_vas_user_t *) _emalloc( sizeof( SPE_vas_user_t ) );
        thing->ctx = ctx;
        thing->ctx->referenceCount++;
        thing->raw = user;
        thing->noFree = 0;
ZEND_REGISTER_RESOURCE( hidden inside SPE_CONS_RETURN_VALUE ):
        zend_register_resource( rval, thing, le_vas_user_t );
    }
    else
    {
RETURN_NULL():
        rval->type = 0;
        return;
    }
}
Here are the last bits of vas_user_get_sid, which differs in that it returns a string (as discussed higher up). The sid returned from VAS is freed by free instead of efree because it wasn’t created by emalloc in the first place. The first bits of this function follow pretty much the same lines as vas_user_init already shown.
.
.
.
    err = vas_user_get_sid( ctx->ctx, RAW( id ), user->raw, &sid );
    SPE_SET_VAS_ERR( err );
    if( err == VAS_ERR_SUCCESS )
    {
        RETVAL_STRING( sid, 1 );
        free( sid );
        return;
    }
    else
    {
        RETURN_NULL();
    }
}
...and, afterward, the preprocessed version.
.
.
.
    err = vas_user_get_sid( ctx->ctx, ( id == NULL ) ? NULL : id->raw, user->raw, &sid );
SPE_SET_VAS_ERR( err ):
    vas_globals.g_vas_err = err;
    if( err == VAS_ERR_SUCCESS )
    {
RETVAL_STRING( sid, 1 ):
        char *__s = sid;
        rval->value.str.len = strlen(__s);
        rval->value.str.val = _estrndup( __s, rval->value.str.len );
        rval->type = 3;
        free( sid );
        return;
    }
    else
    {
RETURN_NULL():
        rval->type = 0;
        return;
    }
}
Because of the careful registration, including the VAS API functions that dispose of complex objects (vas_ctx_free, vas_id_free, vas_user_free, etc.), PHP’s normal destructor mechanism understands that it needs to call these when it’s finished with the object, at least by shut-down time (a call to exit, running off the final right brace, etc.).
We have so far found that the most likely errors arise from imbalanced semantics. In one example, vas_user_get_sid, the sid string being returned wasn’t always stable. In the C code by which VAS implements that function (libs/vasapi/user.c), the string returned was, in some cases, created by strdup and, in others, by returning a pointer to the sid field on the vas_user_t structure. Since that field was disposed of already by PHP in one destructor, clean-up of the string return caused a segmentation fault or double free when passed to free.
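One way to balance the semantics is for the getter to hand back a fresh copy on every path, so that the caller may always pass the result to free. The sketch below shows that idea with hypothetical names (demo_user, demo_user_get_sid); it is not the change actually made in libs/vasapi/user.c.

```c
#include <stdlib.h>
#include <string.h>

typedef struct {
    char *sid;   /* owned by the struct; released by its destructor */
} demo_user;

/* Stores a freshly allocated copy of the sid in *sid_out and returns 0,
 * or returns -1 on failure. The caller owns the copy and releases it
 * with free(); the struct keeps ownership of its own field. */
static int demo_user_get_sid(const demo_user *user, char **sid_out)
{
    size_t len;
    char  *copy;

    if (user == NULL || user->sid == NULL || sid_out == NULL)
        return -1;

    len  = strlen(user->sid) + 1;
    copy = malloc(len);
    if (copy == NULL)
        return -1;

    memcpy(copy, user->sid, len);
    *sid_out = copy;    /* never a pointer into *user */
    return 0;
}
```

The extra copy costs an allocation, but it removes the case where the same memory is released once by the object's destructor and once by the caller.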
valgrind was useful in pinpointing the otherwise mysterious fault; however, only inspection led to solving the bug, as I don’t yet have the ability to step through these bindings. Defining SPE_DEBUG at the top of vasapi.c might have been as useful as valgrind.