Myghty provides the ability to cache any kind of data, including component output, component return values, and user-defined data structures, for fast re-retrieval. All components, whether file-based or subcomponent, are provided with their own cache namespace, which in turn stores any number of key/value pairs. Included with Myghty are implementations using files, dbm files, direct in-memory dictionaries, and memcached. User-defined implementations are also supported and can be specified at runtime. The particular implementation used can optionally be specified on a per-key basis, so that any single namespace can store individual values across any number of implementations at the same time.
Caching is generally used when process-intensive and/or slow operations must be performed to produce output, and those operations have few or no dependencies on external arguments, i.e. their value is not likely to change in response to request variables. Examples of things that are good for caching include weather information, stock data, sports scores, news headlines, or any other kind of data that is expensive to retrieve and does not need to be updated with real-time frequency.
The front-end to the mechanism is provided by the myghty.cache package, and can be configured and used through global configuration parameters, per-component flags, and a programmatic interface.
The simplest form of caching is the caching of a component's return value and output content via flags. An example using a subcomponent (subcomponents are explained in How to Define a Subcomponent):
    <%def heavytext>
        <%flags>
            use_cache = True
            cache_type = 'dbm'
            cache_expiretime = 30
        </%flags>
        <%init>
            data = big_long_query()
        </%init>
        Your big query returned: <% data.get_info() %>
    </%def>
In this example, the component's output text and its return value (if any) will be stored the first time it is called. Any calls to this component within the next 30 seconds will automatically return the cached value, and the %init section and body will not be executed. At the moment 30 seconds have elapsed, the first call to occur within this new period will result in the component executing in full again and recreating its output text and return value. Subsequent calls that occur during this second execution will continue to return the prior value until the new value is complete. Once the new value is complete, it is stored in the cache and the expiration counter begins again, for the next 30 seconds.
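The expiration behavior described above can be sketched outside of Myghty in plain Python. This is a minimal illustration, not Myghty's implementation: the class name `ExpiringValue` and its `clock` parameter are assumptions introduced here so the timing can be controlled, and the sketch ignores the concurrent case where other callers keep receiving the prior value during regeneration.

```python
import time


class ExpiringValue:
    """Hypothetical stand-in for one cached component result (not a Myghty class)."""

    def __init__(self, createfunc, expiretime, clock=time.time):
        self.createfunc = createfunc  # produces a fresh value
        self.expiretime = expiretime  # lifetime in seconds
        self.clock = clock            # injectable so the sketch can be tested
        self.value = None
        self.stored_at = None         # None means nothing has been cached yet

    def get(self):
        now = self.clock()
        if self.stored_at is None or now - self.stored_at >= self.expiretime:
            # First call, or the expiration period has elapsed: execute in full.
            self.value = self.createfunc()
            # The expiration counter restarts once the new value is complete.
            self.stored_at = self.clock()
        return self.value
```

With `expiretime=30`, every call in the first 30 seconds returns the stored value without invoking `createfunc` again.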
Note that the parameter cache_type is set to 'dbm', which indicates that dbm-based caching is used. This is the default setting when a data_dir parameter is configured with the Myghty application.
For components that contain a %filter section, the result of filtering is stored in the cache as well. This allows the cache to be useful in limiting the execution of a process-intensive or time-consuming filtering function.
When a component is recompiled, its cache contents are automatically expired, so that the cache can be refreshed with the value returned by the newly modified component. This makes it safe to cache a component whose output never changes with no expire time at all; such a component executes only once per compilation, and not again for the life of the cache.
The traditional caching interface looks like this:
    <%init>
        def create_data():
            return get_some_data()

        cache = m.get_cache()
        data = cache.get_value('mydata', type='memory',
            createfunc=create_data, expiretime=60)
    </%init>
The creation function argument is optional, and the cache can be populated externally as well:
    <%init>
        cache = m.get_cache()
        if not cache.has_key('mydata'):
            cache.set_value('mydata', get_some_data(), expiretime=60)

        data = cache.get_value('mydata')
    </%init>
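To experiment with the two population styles without a running Myghty application, a dict-backed stand-in can mirror the method names used in the examples. This is only a sketch: `SketchCache` is a hypothetical class, not Myghty's Cache object, and it deliberately ignores the `type` and `expiretime` arguments. `get_some_data` is likewise a placeholder for an expensive operation.

```python
class SketchCache:
    """Dict-backed stand-in mirroring the method names used above.

    Not Myghty's Cache class; container types and expiration are left out.
    """

    def __init__(self):
        self._data = {}

    def has_key(self, key):
        return key in self._data

    def set_value(self, key, value):
        self._data[key] = value

    def get_value(self, key, createfunc=None):
        if key not in self._data:
            if createfunc is None:
                raise KeyError(key)
            # Cache miss: build the value once and remember it.
            self._data[key] = createfunc()
        return self._data[key]


def get_some_data():
    # Hypothetical expensive operation.
    return {"headline": "example"}


cache = SketchCache()

# Style 1: let the cache call the creation function on a miss.
first = cache.get_value("mydata", createfunc=get_some_data)

# Style 2: populate the cache externally, then read it back.
if not cache.has_key("other"):
    cache.set_value("other", get_some_data())
second = cache.get_value("other")
```

The creation-function style keeps the lookup and the fallback in one call, while the external style suits values produced elsewhere, for example by a separate process.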
To programmatically cache the output text of a component, use the m.cache_self() method on the request object, which is a reentrant component-calling method:
    <%init>
        if m.cache_self(key="mykey"):
            return
    </%init>

    # rest of component
To get the component's return value via this method:
    <%init>
        ret = Value()
        if m.cache_self(key="mykey", retval=ret):
            return ret()

        # rest of component
        return 3 + 5
    </%init>
Generally, the %flags method of caching a component's output and return value is a lot easier than the programmatic interface. The main advantage of the programmatic interface is that the key can be decided at runtime, for example based on component arguments, and passed as the "key" argument. The same applies if any of the other properties of the cache are to be determined at runtime rather than at compile time.
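One way to derive a key from component arguments is to serialize them in a stable order. The helper below is a hypothetical illustration; `make_cache_key` is not part of Myghty, and the key format is an arbitrary choice.

```python
def make_cache_key(prefix, **args):
    """Build a deterministic cache key from keyword arguments.

    Hypothetical helper: sorting the names makes the key independent of
    the order in which the arguments were passed.
    """
    parts = ["%s=%r" % (name, args[name]) for name in sorted(args)]
    return prefix + ":" + ",".join(parts)


key_a = make_cache_key("weather", city="Oslo", units="metric")
key_b = make_cache_key("weather", units="metric", city="Oslo")
```

Inside a component, the resulting string could be passed as the "key" argument to m.cache_self(), so that each distinct combination of arguments gets its own cached output.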
The cached information may be shared within the scope of one process or across multiple processes. Synchronization mechanisms are used to ensure that regeneration is performed by only one thread of one process at a time, returning the expired value to other processes and threads while the regeneration is occurring. This maximizes performance even for a very slow data-regeneration mechanism. In the case of a non-memory-based cache, an external process can also access the same cache.
Note that on the Windows platform, Myghty includes only thread-scoped synchronization (contributors are welcome to contribute a Win32 file locking scheme). The "file" and "dbm" cache methodologies therefore may not be entirely synchronized across multiple processes on Windows. This matters only if multiple servers are running against the same cache, since Windows does not have any forking capability and an Apache server or similar therefore uses only threads.
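The thread-level half of this idea, one regenerator while everyone else receives the expired value, can be sketched with a non-blocking lock. This is an assumption-laden illustration rather than Myghty's synchronization code: the class `StaleWhileRegenerating` and its `expire()` method are invented here, and cross-process locking is not covered.

```python
import threading


class StaleWhileRegenerating:
    """Hypothetical holder for one cached value (not a Myghty class)."""

    def __init__(self, createfunc, initial):
        self._createfunc = createfunc
        self._value = initial
        self._expired = False
        self._regen_lock = threading.Lock()

    def expire(self):
        self._expired = True

    def get(self):
        if not self._expired:
            return self._value
        # Only the caller that wins the non-blocking acquire regenerates.
        if self._regen_lock.acquire(blocking=False):
            try:
                if self._expired:
                    self._value = self._createfunc()
                    self._expired = False
            finally:
                self._regen_lock.release()
        # Callers that lost the acquire fall through without waiting and
        # receive whatever value is currently stored, i.e. the expired one
        # while regeneration is still in progress.
        return self._value
```

Because losing callers never block, a slow creation function delays only the one caller that performs the regeneration.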
Caching has an assortment of container methodologies, such as MemoryContainer and DBMContainer, and provides a base Container class that can be subclassed to add new methodologies. A single component's cache can have containers of any number of different types and configurations.
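The shape of this design, a common base class with interchangeable storage back ends, can be illustrated with a small stand-alone sketch. The class names and the two-method interface below are assumptions made for illustration; the real Container class in Myghty may define a different interface.

```python
import os
import pickle
import tempfile


class SketchContainer:
    """Hypothetical base class; Myghty's real Container interface may differ."""

    def set_value(self, value):
        raise NotImplementedError

    def get_value(self):
        raise NotImplementedError


class MemorySketchContainer(SketchContainer):
    """Keeps the value in process memory."""

    def __init__(self):
        self._value = None

    def set_value(self, value):
        self._value = value

    def get_value(self):
        return self._value


class FileSketchContainer(SketchContainer):
    """Pickles the value to a file so it survives outside the process."""

    def __init__(self, path):
        self._path = path

    def set_value(self, value):
        with open(self._path, "wb") as f:
            pickle.dump(value, f)

    def get_value(self):
        with open(self._path, "rb") as f:
            return pickle.load(f)


# One namespace, two keys, two different storage back ends.
tmpdir = tempfile.mkdtemp()
namespace = {
    "scores": MemorySketchContainer(),
    "headlines": FileSketchContainer(os.path.join(tmpdir, "headlines.pickle")),
}
namespace["scores"].set_value([3, 1])
namespace["headlines"].set_value(["example headline"])
```

Since callers only use the shared interface, each key can pick the back end that suits its data without the surrounding code changing.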
Caching of the URI resolution step can also be done to improve performance. See use_static_source for more information on using the URICache.
Caching options are all specified as Myghty configuration parameters in the form cache_XXXX, to identify them as options being sent to the Cache object. When calling the m.get_cache() method, parameters may be specified with or without the cache_ prefix; the prefix is stripped off. While some cache options apply to the Cache object itself, others apply to specific forms of the Container class, the two current classes being MemoryContainer and DBMContainer.
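The prefix handling can be pictured as a simple normalization of parameter names. The function below is a hypothetical sketch of that idea, not the code Myghty uses internally.

```python
def strip_cache_prefix(params):
    """Return a copy of params with a leading 'cache_' removed from each name.

    Hypothetical helper: names without the prefix pass through unchanged.
    """
    prefix = "cache_"
    cleaned = {}
    for name, value in params.items():
        if name.startswith(prefix):
            name = name[len(prefix):]
        cleaned[name] = value
    return cleaned


options = strip_cache_prefix({"cache_type": "dbm", "expiretime": 30})
```

Under this sketch, `cache_type='dbm'` and `type='dbm'` end up as the same option, which matches the way the flag names and the get_value() arguments correspond in the examples above.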
The full list of current options is as follows: